HOW TO DEVELOP AN INTEREST IN ONE’S TASKS
AND WORK

WILLIAM F. BOOK 

Indiana University 
Prior to 1924, when the new teacher-training law in Indiana went 
into effect, educational psychology at Indiana University was an 
elective course taken by students who had had an introductory 
course in elementary psychology and who were preparing to teach. 
Since 1924 it has been a required course for every student who expects
to teach or who is qualifying for any kind of teaching certificate.
This requirement has brought about a marked change in the mental
attitude of the students taking this course and in the interest with
which they pursue the work.

At the beginning of the second semester this year (1925-26) the
147 students enrolled in this course in the writer’s classes were asked
to state in writing (1) their real purpose in taking this course; and (2)
what they expected to get out of the course. Sixty-six per cent of
these students stated that they took it because “they had to,” or
because “it was required to get a license.” Twelve per cent stated
that they were taking it because they liked psychology; 1 per cent
because they liked education. The character of the answers given by
the 12 per cent who stated that they took it to prepare to teach indicated
that the dominant idea in their minds was the same as for the
66 per cent; i.e., they were taking the course because it was required,
not because they were interested in the subject or wanted the work.
Only 22 per cent of the students enrolled were interested in the
course when the work began.

The answers to the second question only strengthened this interpretation
of the general attitude of students towards the course, an attitude which
had been found in previous semesters to be a serious handicap to the
instructor in making the work of the course truly helpful to the students.
That is to say, the students now taking this course are not
interested in it and enter the class with the wrong attitude
towards the work. This makes it very difficult to get the quality and
amount of work from the students taking the course that should be
done, a condition that is in sharp contrast to the one which prevailed
two years ago, when an attitude of intense interest in the work
was the rule and when it was easy to get first-class work from practically
every student enrolled in the course. Why this change in
interest and attitude towards the work? And what may be done to
meet such a situation when it arises? Or, stated in another way, how
may a student’s attitude towards his tasks be made favorable and a
genuine interest in his work developed?

Before attempting to answer these questions let us inquire what 
interest really is and how such wrong attitudes towards any school 
subject as the one described are usually acquired. 


WHAT INTEREST REALLY IS


Not everyone, not even a psychologist, knows exactly what interest
is or what is meant when we use this term. Interest has been
defined as “the impulse to attend”; as “the recognition of a thing that
has been vitally connected with our experience before, a thing that
is familiar or old”; or as “a natural tendency to act.” One psychologist
says “the root idea of the term seems to be that of being engaged,
engrossed, or entirely taken up with some activity because of its recognized
worth.” Dewey states that it marks the annihilation of the
distance between the person interested and the materials and results
of his action. According to him, interest is the sign of this organic
union between the subject possessing the interest and the objects or
materials he works upon. Other writers have made interest synonymous
with the feeling of satisfaction or pleasure that always accompanies
spontaneous and successful attention. Such a feeling of pleasure
may be said to be the sign in our consciousness which notifies us that we
are in reality growing interested in a subject or task.

These definitions will help us to understand and to keep in mind 
what interest really is. 


IMPORTANCE OF INTEREST OR EFFECT WHICH IT PRODUCES ON ONE’S
ABILITY TO WORK


Such an interest in one’s tasks is, as every one knows, very important,
and should be developed and carefully maintained while we study or
do our other work. In fact, one of the most necessary conditions
for obtaining human efficiency in any line of work is to cultivate such
an interest in whatever we have to do and, if we do not feel such a
natural interest in our work, to take steps to develop it at once, for keen
interest is one of the most important driving forces in life.

This may be illustrated in a number of ways. A stentor busily
engaged in the pursuit of food, or reacting to a particular kind of
stimulus, will not respond to another stimulus that under other conditions
would elicit a definite response. At such a time its organism
seems physiologically “set” towards a certain stimulus and towards a
particular type of movement, so that a stimulus which normally excites
it to action has no effect upon its behavior.

The same thing has been shown by experiments on animals. A 
hungry cat or rat placed in a puzzle box or maze shows by its every 
movement a mental and physiological adjustment that usually persists 
until it succeeds in getting out of the box. A sleepy or well-fed cat
will, on the other hand, be less “set” towards escape movements
when confined in such a puzzle box or maze. In such instances a
certain psycho-physical adjustment has been set up within the 
organism that helps to determine what responses the animal will make 
and which creates a sort of inner urge that drives it on to make a type 
of movement that is continued until the animal obtains an end that 
may be said to be desired only in a physiological way. Or if put in 
terms of attention, or interest, we may say that the cat is very much 
interested in getting out of the box and will usually stick to the attempt 
until she succeeds. 

Something like this, on a far bigger and more complicated scale,
is what happens with students; only here a purpose, an idea, a
desire, some mental attitude or general feeling tone serves as the
inner urge and helps to determine what sort of responses the individual
will make when confronted by his tasks. Anything he is intensely
interested in or is “set on doing,” he usually finds a way to accomplish.
If there is no way, he proceeds to make one.

This has recently been illustrated by a number of important experi- 
ments and incidents from practical life. Rose Fritz won the world’s 
typewriting contest in 1906 by writing 82 correct words each minute 
for one hour. She told Mr. Kimball, the director of the contest, that 
she had reached the human limit for this type of work—that she had 
written as fast as any human being could ever learn to typewrite. 

Mr. Kimball, however, told her she was wrong. He assured her 
that she had not reached the limit, and convinced her that her own 
record could be passed. The result was that in the following year she 
beat her own record by writing 87 correct words each minute for an
hour; and the following year she raised her record to 95 words, while 
today the world’s record is held by Albert Tangora who wrote, in 
October, 1924, at the rate of 147 correct words each minute for one hour. 

A study by H. D. Kitson of the increased output of 40 hand com- 
positors in the printing industry showed that when these workers 
were subjected to a particular wage incentive which interested them 
in doing more and better work, these seasoned printers who had been 
working at the trade on an average 10.3 years made an average 
gain in output of work of 78 per cent in five months. One man who 
had worked at the trade 23 years increased his output 289 per cent. 

The results obtained in our own study of five different types of 
learning made with 124 college students emphasized in a still more 
striking way the value of interest in learning and work. Our experi- 
ments were devised to determine the effect which interest in improve- 
ment produced upon the progress in learning made by these subjects. 
In each type of learning studied, the sections that were kept interested,
by the conditions of the experiment, in the gains they were making
made more rapid progress than did the sections that could
not, because of the conditions imposed by the experiment, become
interested in their own advancement. Moreover, when the conditions
were reversed for the various stimulus and control sections, the same 
individuals who had been making rapid progress when they were 
directly interested in their advancement ceased suddenly to improve, 
while the sections that had been making only slight improvement 
began suddenly to progress at a much more rapid rate as soon as they 
were definitely interested by the conditions of the experiment in the 
gains they were making. In each of the five types of learning studied 
this interest in improvement not only increased the learner’s rate of 
gain but materially improved the quality of his work.¹

¹ Compare Book, W. F. and Norvelle, Lee: The Will to Learn. Pedagogical
Seminary, Vol. XXIX, 1922, pp. 305-362. Also Book, W. F.: “Learning How to
Study and Work Effectively.” Ginn and Co., Publishers, 1926, Chapter XVI.

Similar results were obtained by Remmers and Knight in an
experiment made on college students. Ten college freshmen who had
been hazed by their fraternity for a week and allowed to
sleep only two hours a night were given some tests in adding columns
of figures at 10 P. M., and their results were compared with the scores
made by an equal number of junior students of equal ability who were
tested under the most favorable classroom conditions at 8 A. M. The
object of the experiment was to determine how much difference it would
make in performing successfully this kind of work when the performers were
keenly interested in the work. The freshmen were told by the president
of their fraternity that they must push themselves to the limit
on these tests, as the score they made would determine their final
acceptance or rejection by the fraternity. The juniors, on the other
hand, had no special interest in the task or in their final score but were
told, the same as the freshmen, to do their best on every test.

Under the conditions of this experiment the freshmen completed an
average of 21 columns of addition for each five-minute period of work
in all the tests, while the juniors completed an average of only 11 columns
per test. In other words, the interest or special motivation which
these freshmen had not only offset the extreme fatigue
produced by the hazing and the loss of sleep, and any advantage the
juniors may have had over them, but in addition produced nearly twice
as much work per unit of time with equal accuracy.

Other examples of the effect which such an interest regularly pro- 
duces might be cited. For example, one of the biggest manufacturers 
in Chicago told his office employees a few years ago that they could 
have all day Saturday every week for a holiday if they would complete 
all their work in the remaining five days. They did it easily after 
that and have enjoyed a five-day week for several years. 

These experiments indicate the effect which interest in a task or in 
one’s own advancement really produces and show why we should 
learn how to become interested in our tasks and work, for interest 
is a necessary condition for doing any task exceedingly well. We 
should, therefore, next ask why interest aids a learner or worker in 
these ways and how such an interest in a particular subject or task 
like elementary or educational psychology may be developed in an 
individual or class. 


HOW INTERESTS ARE REALLY ACQUIRED


Our earliest and strongest interests are, of course, all hereditary,
but as Professor James has pointed out, most of the interests of an
adult have been acquired during the course of his experience and training.
“An adult man’s interests,” he says, “are almost every one of
them intensely artificial; they have been slowly built up. The
objects of our professional interests are most of them in their original
nature repulsive; but by their connection with such natively exciting
objects as one’s personal fortune, one’s social responsibilities, and
especially by the force of inveterate habit, they grow to be the only
things for which in middle life a man profoundly cares.”

This gives us the key to the cultivation of any interests which we 
may desire to develop in our pupils or in ourselves. We can make our 
interests practically anything we will if only we know how, for the 
process takes place according to a few very definite laws. The first 
and foremost of these laws is the fact that all new and acquired interests 
must be built upon the native interests or tendencies to response with which 
an individual is already endowed. 

This means that if one wants to develop an interest in a particular 
subject or task he must first of all find something about this new subject 
or thing that already interests him. For example, the writer once 
aroused a genuine interest in Latin in a high school boy who was failing 
in this subject and had learned to thoroughly detest it. In a confer- 
ence we found that this boy was intensely interested in being a physi- 
cian and had his future career as a surgeon rather carefully planned. 
He put in all his spare time loafing at a drug store up town and spent 
much time visiting a doctor who had taken a personal interest in him. 
“Harry,” the writer said one afternoon when school was dismissed,
“I want you to go to Shaptaugh’s drug store,” naming the place where
he regularly spent his time after school, “and copy the names of 50
drugs from the bottles on the shelf and bring me the list tomorrow
afternoon. Then I want you to go to Dr. Anderson’s office,” naming Harry’s
favorite physician, “and copy from his medical books the names of 50
important diseases and bring that list of names with the other list.”

Little more was needed. Practically every name Harry had on the 
two lists was either a Latin term or a word of direct Latin origin. 
This fact interested him at once and was a sufficient incentive for 
him to apply himself, in a gingerly way at first, to the Latin he had 
learned to detest. As soon as he began to apply himself to the work 
he not only began to succeed but became genuinely interested, because
of the action of a second law that governs the acquisition of all
interests.

This law may be stated as follows: “In order to develop an interest
in a subject or task secure much information about it.” This principle
is appealed to when an advertiser or salesman gives his prospective
customers important information about the article he wishes them to
buy. It is the law that has operated in building up in every scientist
the keen interest which he has in his specialty. That is to say, a
scientist’s interest consists in large measure of the knowledge that he
possesses in a particular field. The writer, for example, was not always
interested in psychology, but by the steady accumulation of facts in
this field an interest in this subject has been gradually developed that
amounts almost to passionate absorption, an interest that has completely
overshadowed, even eliminated, most of the native interests
that formerly dominated his conduct and thought.

If a given piece of work or subject does not in and of itself interest
us or our students, we should therefore think, and stimulate them to think,
with intense enthusiasm of the other desirable things to which it may be a
means, and begin at once to get more information about the problems
in this field. The former method was used to develop an interest
in the 40 printers studied by Dr. Kitson, who made such marked 
advancement in developing more efficient methods of doing their work. 
When this experiment began these men had been engaged in the print- 
ing trade for an average of 10.3 years. A record of the average amount 
of work done was taken at the end of the first week after they began 
to work in the plant, and before the special wage incentive used to 
interest them in learning to do more and better work began to urge 
them on to improve their methods of work. The records made 
during their fourth, eighth, twelfth, sixteenth, and twentieth weeks of 
service were used to measure any improvement in skill that might have 
taken place during this time. 

One of the rules of this particular industry was that every employee
could get more pay for all the work done over a certain amount, which
was placed, after much experimentation, at a point considerably above
the skill of the average journeyman printer in the trade. As a result
every one of the 40 men whose record was studied boosted his output 
above this bonus point, the average rate of improvement being 78 
per cent in five months. In other words, the average rate of work of 
these seasoned printers had been only about three-fourths of what they 
could easily learn to do if sufficiently interested in making further 
improvement in their methods of work. 
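
The relation between a percentage gain and the earlier rate of work can be checked with a short calculation. The sketch below assumes that the 78 per cent gain is reckoned on the printers’ starting output; the function name is hypothetical and used only for illustration. On that assumption the earlier rate comes to somewhat more than half of the later rate, so the figure of three-fourths given above should be read as a rough one, or as resting on a different base of comparison.

```python
def prior_fraction_of_final(gain_percent: float) -> float:
    """Return the starting rate as a fraction of the final rate.

    Assumption: gain_percent is measured relative to the starting rate,
    so final = start * (1 + gain_percent / 100).
    """
    return 1.0 / (1.0 + gain_percent / 100.0)


# Average gain reported for the 40 printers (78 per cent).
average_fraction = prior_fraction_of_final(78)

# Largest individual gain reported (289 per cent).
largest_fraction = prior_fraction_of_final(289)

print(f"average printer: {average_fraction:.2f} of final rate")
print(f"largest gain:    {largest_fraction:.2f} of final rate")
```

The same function applies to any of the gains quoted in this section, provided the base of the percentage is known.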

A third law governing the acquisition of interest is the fact that
one must arrange matters so that he can and will succeed with all his
work. No genuine or lasting pleasure can be attached to a subject, or
to the performance of a task, unless the worker is succeeding with his
tasks. New things must be stated in terms of the old, the unknown
in terms of the known. Unless this principle is followed and the worker
is kept succeeding most of the time, no genuine interest in the subject
can be developed. In other words, success is required to make a
student exert himself fully and vigorously towards his tasks, and such
active exertion is the final and most important law that controls the
development of what is generally called interest.

The operation of this last law in the development of interest is well 
illustrated by a story recently told the writer by one of the leading 
superintendents of schools in the United States. In discussing how 
interest in a particular school subject might be developed, he related 
how the best teacher he ever had proceeded to develop it in a class in 
general science in the high school. This teacher, he said, always began 
with an object that was thoroughly familiar to every member of his 
class. One day it was a bean which they put to soak and watched 
sprout and grow during succeeding days. While this was going on 
they dissected another bean and determined the structure and function 
of its several parts: the hard outer shell, the inner covering, and the 
food portion stored inside these coverings, which the young plant 
consumed as it grew and developed a way of extracting its food from 
the soil and the air. 

The next week this teacher asked each member of the class to bring
a hard-boiled egg for inspection and study. All now readily discovered,
on the basis of what they had done with the bean, the hard protective
shell on the outside, the inner lining, and the mass of food on the 
inside for the embryo chick which they later watched develop in 
another egg as the experiment proceeded, discovering for themselves 
a principle which governs an important life process in both the plant 
and the animal world. 

On another occasion the class was given a bit of information about 
a certain plant that was dissected and studied in class, and two of the 
pupils were asked to go to a certain swamp some seven miles distant 
to secure for class use the next day a closely related plant, the skunk 
cabbage, which they were told was exceedingly rare in that community. 
Just enough information had been given these boys to arouse their curi- 
osity about the new plant and to enable them to identify it by its pecul- 
iar odor. Their native and previously acquired interests were appealed 
to by having them use the principal’s horse and buggy for the seven 
mile drive after the plant, and by the fact that they were made responsible
for securing a new specimen for class use on the following day.

That these and a score of other similar incidents were still remembered
by this man and by all living members of his class 30 years later, when
the author of our story recalled them in that city in a commencement
address, is proof of the keen interest which this great teacher had
developed by the simple rules already discussed. (1) First, he had
them secure definite information about each subject and object they
studied. (2) He always stated the new in terms of what they already
knew or had previously done. (3) He made it possible for them to
succeed with everything he asked them to do. And (4) he always had
his students actively engaged on some project or bit of work, exerting
themselves actively towards each problem they took up for study.
In fact, so successfully was this done and so great was the interest 
aroused in each subject taken up in the class that this superintendent’s 
younger brother and mother joined in practically every experiment 
that was performed in school, an experience which was duplicated in 
practically every home represented in the class, I have been told. 


HOW TO DEVELOP AN INTEREST IN A PARTICULAR SUBJECT OR TASK


Such methods will develop an interest in any subject or task. 
If, for example, you should desire to develop an interest in educational 
psychology in a class of college students you must somehow keep them 
actively engaged in working problems in this field. You must help 
them apply the knowledge so gained to the study of other subjects 
and to their own study and work. You must encourage them to 
choose topics from this field for themes in English Composition, or 
for conversation and discussion with other members of the class. 
You must encourage and require them to take an active part in the
class discussions, not just lecture to them. The problem consists of
keeping them actively engaged in solving problems or in working 
projects connected with the subject-matter treated in the course and 
above all you must arrange the work and gauge the assignments so 
that they can and will succeed with all the work you ask them to do. 
Presently you will find that this subject will not only lose all its disagreeable
features but fairly bristle with significance and interest, an
attitude that serves as a stimulus to urge your students on to still
greater activity and success in this direction, which, according to the
laws already discussed, will create still further enthusiasm and
interest, the chief factor in producing success in every field of work.


GENERAL CONCLUSIONS AND RESULTS OF THE CLASS EXPERIMENT 


We may now, in closing, state briefly what was done at Indiana
University to try to change the unfavorable attitude towards the
course in educational psychology mentioned in the opening paragraphs of
this report, state the results obtained in this experiment, and draw one
conclusion that seems warranted on the basis of the facts.
By means already described it was determined what the attitude
of the class in question actually was towards this subject when the 
work they had to do began. We next had them make a careful study 
of the results that had been obtained in the experiments made by other 
investigators to ascertain the effect that one’s attitude towards his 
tasks and work actually has upon his ability to do this work. We 
next worked out what the most favorable attitude towards success, 
towards one’s tasks, and towards one’s own advancement really is and 
determined from the experiments made in the field how much effect 
these factors and a genuine interest in one’s work actually produce 
when measured in terms of the student’s ability to do his work. We 
also helped them to work out how and why these effects were produced. 
These students were next given references to read which explained
exactly what had to be done to develop such an interest in a particular
task or in their work as a whole, and this problem was carefully discussed
and illustrated in class.

This changed their attitude completely. They began to apply 
themselves successfully to the early assignments made in the class, 
which were very carefully worked out by the instructor and enough 
direction given to enable each student to succeed with every task that 
we asked him to perform. Care was also taken to point out the prac- 
tical value and personal significance of the various problems discussed 
in the class, so far as this could be done. The result was a complete 
change of attitude, a growing interest in the work of the course and in 
the problems that were considered from day to day. In fact this 
method restored the attitude and spirit of these laggard sections to 
what it had been two years before when the course was an elective and 
a delight to teach. 

It would therefore seem that college students should often be shown 
by a careful study of the experimental results bearing on the subject 
the effect of acquiring and maintaining a favorable attitude towards 
their tasks and work, the normal effects of maintaining a proper atti- 
tude towards their success and towards their own advancement; 
that they should often be shown that such an interest in their work, a 
proper attitude towards success, and towards their own progress 
actually helps them in doing the work they are required to do. If in 
addition they can be shown exactly how they must proceed to develop 
such helpful interests and attitudes towards their tasks this group of 
psychological factors may be fully utilized in bringing about the changes 
in these students which the teacher desires to make. 
THE WIDE DIVERSITIES OF PRACTICE IN FIRST
COURSES IN EDUCATIONAL PSYCHOLOGY¹


DEAN A. WORCESTER 
The Ohio State University 


In a recent number of this Journal, O. B. Douglas² has reported
some very interesting facts concerning the present status of the first
course in Educational Psychology, facts which seem to indicate
a rather high degree of uniformity of practice as to this course. However,
a closer scrutiny of some of these facts, together with the examination
of some additional data, reveals a situation in which uniformity is
rather startlingly lacking. The present study may perhaps be
thought of as supplementary to rather than in opposition to that of
Douglas, but it certainly suggests that some of his conclusions will
probably require modification in the light of further evidence.

Douglas’ findings, so far as they concern us here, may be briefly 
summarized: 

In 15 of the 61 institutions reporting, the course in Educational 
Psychology is in the department of Psychology and in 46 it is not. 

Forty-five of the 65 institutions reporting require the course for the 
degree in Education. 

Of the institutions reporting, 46 have the course as prerequisite to 
other courses in Education. 

Sixteen of the institutions offer the course in the Freshman year, 27 
in the Sophomore year, 24 in the Junior year and 3 in the Senior year. 

The most commonly used texts are (numbers following the titles 
indicate the number of institutions reporting that text): Starch, 
Educational Psychology, 25; Gates, Psychology for Students of Educa- 
tion, 18; Strong, Introductory Psychology for Teachers, 13; Pyle, The 
Psychology of Learning, 12; Woodworth, Psychology, 12. Also quite 
commonly used are: Freeman, How Children Learn, 10; Seashore, 
Introduction to Psychology, 9; Thorndike, The Psychology of Learning, 9; 
Terman, The Measurement of Intelligence, 9. 

From these data Douglas concludes: (1) That the divorce between
General Psychology and Educational Psychology is not yet complete,
implying a tendency toward a separation of these departments.
(2) That Educational Psychology is generally recognized as basic to
further work in professional training and to teaching. (3) That there
is coming to be an agreement as to the time at which the course should
be given. (4) That while many texts are in use, there is decided
agreement as to texts, at least when the first course in Educational
Psychology is given in the Sophomore or Junior year, and by implication,
that there is equivalent agreement as to the desirable content of
the course.

¹ This article constitutes part of a research carried on at the Ohio State
University.

² The Present Status of the Introductory Course in Educational Psychology
in American Institutions of Learning. Journal of Educational Psychology, Vol.
XVI, September, 1925, pp. 396-408.

The evidence for (2) is quite clear. The course in Educational 
Psychology is coming to be widely recognized as basic to teaching 
and to further courses in the department of Education. To this may
be added the finding of O’Brien¹ that alumni rate this course as the
most valuable one in their professional training. Concerning the other 
conclusions, however, especially (4), there is more to be said. 


DEVELOPMENT OF THE COURSE IN EDUCATIONAL PSYCHOLOGY


The writer has gone over in some detail the early catalogs of 11 
of the teachers colleges and universities first giving the course (so 
far as he can make out, the first) and finds the proportion of cases in 
which the course was originally given in the department of Education 
(or Pedagogics, Didactics, etc.) to be almost exactly the same as it is
at the present time, i.e., 3 to 1. There seems, then, to be no general
tendency toward either divorce or unification of the departments, at 
least so far as it is revealed by the college catalogs. 


THE TEMPORAL POSITION OF THE COURSE IN EDUCATIONAL
PSYCHOLOGY


It is difficult to see from Douglas’ figures that there really exists a
central tendency as to the time at which the course should be given.
Since the course is so commonly prerequisite to further professional
courses, the possible years for beginning it are practically limited to
the first three. The distribution, Freshman, 16; Sophomore, 27;
Junior, 24, could conceivably occur by pure chance. Again, the distinction
between the Sophomore and Junior years is that between the
underclass and the upperclass divisions, and this distinction between
the Junior and the Senior college is becoming more and more emphasized.
There really appears, then, concerning the placing of the course,
almost as wide a disparity of opinion as is possible. Of course, as long
as some states grant teaching certificates after two years of college
work and also require Educational Psychology in that two years’
course, the issue is forced. The real merits of the question should,
however, have the careful consideration of educational psychologists.

¹ O’Brien, F. P.: Employing Student Criticism in Revising Courses in Education.
Educational Administration and Supervision, Vol. XI, September, 1925,
pp. 394-398.
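
The remark that the distribution Freshman, 16; Sophomore, 27; Junior, 24 could occur by pure chance may be examined with a chi-square test of goodness of fit. The sketch below is only an illustration and rests on assumptions not stated in the text: that the three years are equally likely under chance, and that the 67 reports can be treated as independent, although some institutions evidently reported more than one year. The function names are hypothetical. With two degrees of freedom the probability of a chi-square value at least as large as the one observed is exactly exp(-x/2), so no statistical library is needed.

```python
import math


def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against equal expected counts."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def p_value_two_df(chi2):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For 2 degrees of freedom the survival function has the closed form exp(-x/2).
    """
    return math.exp(-chi2 / 2.0)


# Freshman, Sophomore, Junior counts as given by Douglas.
counts = [16, 27, 24]
chi2 = chi_square_uniform(counts)
p = p_value_two_df(chi2)

print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

Under these assumptions the probability is well above the levels usually taken to indicate a real departure from chance, which agrees with the view that the figures show no clear central tendency.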

After all, the lack of unanimity of opinion as to the department 
in which the course in Educational Psychology shall be given or as to 
the year in which the study shall be started is perhaps not soimportant. 
The real question is undoubtedly: ‘‘What shall the contents of the 
course be?” Certainly it would seem that a course required by three- 
fourths of the departments for a degree and by the same proportion 
as a prerequisite to other courses in Education should be a course 
which contains a relatively large amount of verified and agreed upon 
material. It is quite safe to assume that elementary courses in Physics 
or Chemistry have, despite varieties of treatment, a high degree of 
community of content. Is this true of Educational Psychology? 
Does a state department which accepts from a standard, class A uni- 
versity, credits for the course really have any definite idea as to the 
material with which the course dealt? Let us look for answers to 
these questions by examining some of the textbooks and courses of 
study in use. 


AGREEMENT AMONG TEXTBOOKS 


Douglas points out that there is decided agreement as to texts 
when the first course in Educational Psychology is given in the Sopho- 
more year or Junior year. The texts most commonly used in these 
years are those by Starch, Gates, Strong, Pyle and Seashore. Douglas 
says further, “Since the great majority of colleges and universities 
insist on giving this course in the Sophomore or Junior years, the 
prospective author has a rather definite level of mental ability and 
experience¹ by which the subject-matter of his book may be gauged. 
This has probably led to a more uniform content in those books of 
recent publication.” But can we assume from the fact that a few 
books are commonly used, that there is substantial agreement as to 
their content? Starch’s text heads the list of those used in the Sopho- 
more and Junior years with Gates’ next. Gates gives 12 per cent of 
his whole text to a discussion of the receiving, connecting and reacting 





1 The writer has been collecting data which show that here, too, there is a 
surprising variation. 
mechanisms and gives 4 per cent more to a description of conscious 
states and processes. Starch gives 0 per cent to these topics. On 
the other hand, Starch uses about 40 per cent of his space for the 
psychology of and tests in the school subjects, as reading, writing, etc. 
Gates does not touch upon these matters. Indeed, judging from the 
amount of space given to definite topics, Gates and Starch agree to 
the extent of only about 33 per cent. But their real agreement is 
less than that. For example, Gates, following Thorndike, treats the 
subject of instinct as one of maximum importance. Starch devotes a 
chapter to the subject most of which is given over to an exposition of 
the inadequacy of educational doctrines based upon instinct, and 
doesn’t mention the subject again. Both authors take up transfer 
of training; Starch devotes 14 per cent of his text to the topic and uses 
around 30 tables and 3 graphs. Gates gives 4 per cent of his book to 
the topic and uses one table and no graphs. 

Strong’s text, only about half as long as that of Starch or Gates, 
and constructed to a large extent as a laboratory manual, is so diver- 
gent from the others as to resist adequate comparison. From the 
proportionate amount of space given to the particular topics treated 
this text would seem to agree to approximately 50 per cent with 
Gates and 30 per cent with Starch. The make-up and laboratory 
method of the text, however, reduces this apparent agreement mark- 
edly. Strong almost entirely neglects transfer of training but gives 
about 35 per cent of his space to individual differences. Gates and 
Starch each use about 5 per cent of their books for this subject. 

The Psychology of Learning by Pyle represents still another distinct 
type of text. His agreement with Gates is about 50 per cent although 
he does not discuss the nervous system, while with Starch he agrees 
to about 40 per cent but does not take up the psychology of the special 
school subjects. With Strong the agreement, too, seems to be around 
40 per cent. But it should again be noted that the agreement here 
noted refers merely to the topics treated and the proportionate amount 
of space given to them. Seashore’s text, the next most commonly used 
in the Sophomore and Junior years, does not even purport to be an 
educational psychology and so represents a fifth distinct type of text. 

In fact one would be somewhat put to it to discover five texts on 
supposedly the same subject which vary more than do these. And if 
one should search for one more to vary as much as possible from any of 
these he could probably do no better than to choose Terman’s The 
Measurement of Intelligence, used as the introductory text in nine of 
the institutions reported by Douglas. The fact that the books 
mentioned are those most commonly used as texts might be 
interpreted, then, not as a consensus but a chaos of opinion.¹ 


A COMPARISON OF COURSES OF STUDY 


But perhaps the courses as actually given do not vary as much as 
do the textbooks. This would hardly be expected to be the case as 
the writers of texts presumably give courses but, nevertheless, it might 
be worth while to examine some courses of study. The writer has 
received the outlines of the first course in Educational Psychology 
from 10 colleges and universities.² A certain university sent two 
outlines, one of the course as taught and one as they would like it to 
be taught, so 11 were available for comparison. While some of the 
reports describe the course in detail, giving the amount of time spent 
on each topic, etc., others give only a list of topics included, so the 
comparison must rest largely upon these lists. But this is enough to be 
enlightening. 

To begin with, the fact that one of the institutions follows rather 
closely Starch, one Gates and one Bolton is evidence of the diversity of 
teaching. As pointed out above, Starch and Gates agree to not more 
than 33 per cent. Bolton does not agree with either of the others to 
more than 35 per cent. 





1 The writer has examined rather intensively 12 texts in Educational Psychology 
including, in addition to those of Starch, Gates, Strong and Pyle, dis- 
cussed above; Bolton, Everyday Psychology for Teachers; Thorndike, Educational 
Psychology (briefer course); Pillsbury, Education as the Psychologist Sees It; Cam- 
eron, Psychology and the School; Averill, Elements of Educational Psychology; Colvin, 
The Learning Process; Betts, The Mind and Its Education; Gordon, Educational 
Psychology. The only topic discussed by all of them under the same name is 
instinct, and the treatment of this topic varies as indicated above, from one-half 
of 1 per cent to 14 per cent of the whole text. Learning, in some phase or other is 
discussed by all, sometimes as learning, by others under memory, the retention of 
experience, association, habit, the art of study, reasoning and problem solving, 
the acquisition of percepts, ideas, etc. The treatment varies from a few rules of 
association and memory to practically the whole book. Eleven of the texts deal 
with reasoning—or thinking or associative learning, etc. Transfer of training is 
mentioned by 11 of the 12 and is discussed by 10 of them—varying in amount 
from 3 per cent of the text to 14 per cent. Nine deal with the nature of indi- 
vidual differences, including the statistical determination and measurement of 
them. And so on. 

2 Outlines were received from the universities of Michigan, Chicago, Wisconsin, 
Georgia, Iowa, Washington (State), Ohio (State); from Stanford and Cornell 
universities and from the Oklahoma Women’s College. 
Some of the universities, e.g., Wisconsin, conduct regular laboratory 
sections in connection with the course. Most of them do not. 
From the time schedules of the two or three institutions which sent 
them, it is noted that to the topic “The material with which we have 
to work” (including individual differences, measurement of intelli- 
gence, correlation of traits, etc.) one gives four weeks’ time and another 
six weeks. To learning one gives eight weeks and one two weeks, a 
ratio of four to one. The time given to the psychology of the school 
subjects varies as does the space in textbooks, from none to a third of 
the course. The transfer of training is covered in two lectures in one 
course but requires two weeks in another. 

All of the courses touch upon learning in at least some of its phases; 
all, for example (with one possible exception), take up the learning 
curve. All discuss transfer of training; ten treat of instinct; nine of 
individual differences; eight of thinking, or rational learning; 
seven of associative learning (memory or conditioned reflex); 
seven of perception or ideas; five of incentives to learning; five of 
the measurement of capacities (traits or intelligence); five of the 
subject of heredity in general; four of educational tests; three of 
the psychology of the school subjects; and so on. 

Some of the topics mentioned in but one outline are: The child’s 
personality; the child’s physical nature, growth, etc. (one other has 
“The Hygienic Basis of Learning”); vocational psychology; imitation; 
imagination; harmful traditions and superstitions in education; the 
psychology of moral education; social psychology and education. 
It is evident that the diversity in the courses is as great as the diversity 
in the texts. 


CONCLUSIONS 


Douglas and O’Brien have shown that the course in Educational 
Psychology is generally considered to be basic to the professional train- 
ing of the teacher and considered by the teacher to be of great intrinsic 
value. Such agreement as to the importance of the course would lead 
one to expect that the course would have established a recognized 
position in the sequence of courses and that it would exhibit a consider- 
able amount of uniformity of content and of method of presentation. 
The above study shows that such is not the case. The department in 
which the course shall be given seems to have been largely fortuitous. 
Educational psychologists are unable to agree as to the college year 
in which the course shall be begun and (what is by far more important) 
they in no way agree as to what material the course shall include. 
The present condition of the course, taken at large, seems almost 
chaotic, and a vigorous attack upon the problem by educational 
psychologists appears imperative. 

To this end it is possible that instead of looking first to the content 
of General Psychology, it may be wise to examine carefully the teach- 
ing situation, to discover just what are the common, actual teaching 
problems and then look to psychology to see what aid it can give towards 
their solution. The writer is at present working on the problem 
from this point of view.¹ 





1 His method of approach is more fully indicated in an article more specifically 
on the subject: Teachers’ Problems and Courses in Educational Psychology. 
Educational Administration and Supervision, Vol. XI, November, 1925, pp. 550- 
555. 
IMPROVING THE VALIDITY OF COLLEGIATE 
ACHIEVEMENT TESTS¹ 


ELEANOR PERRY WOOD 
Teachers College 


Purposes.—The main purposes of this study are: 

1. To find which of three types of objective tests of achievement 
in a college course in a social science is the most valid when each type 
of test is allowed one full hour. 

2. To discover by empirical methods the best scoring formula 
for each of these three forms of tests. 

3. To find the relative rates of growth in validity of the three forms 
of tests with increase in time-allowance. 

Background.—The general background of this project may be 
found in a series of articles entitled “Studies of Achievement Tests” 
by Ben D. Wood, which appeared in the January, February and April, 
1926, issues of the Journal of Educational Psychology. In these articles 
(q.v.) evidence is presented to show that “Number Right” scores are 
more reliable than “Right minus Wrong” scores on “Do Not Guess” 
true-false tests (thus confirming the earlier findings of Professor Ruch² 
and Dr. Stoddard); but Wood also finds that “Right minus Wrong” scores on 
true-false tests of 50 or more items yield higher correlations with 
acceptable criteria than the “Number Right” scores yield. This 
apparent “opposition” between Reliability and Validity suggested 
that it might be worth while to make direct studies of the relative 
validities of the three most commonly used forms of objective questions, 
and of various methods of scoring each of these three forms, 
given equal time-allowances. 

Construction and Administration of the Tests.—The final examina- 
tion in Government I in Columbia College in January, 1926, consisted 
of four one-hour tests, as follows: 





1 This study was made possible by a grant from the Commonwealth Fund, and 
by the cooperation of, and financial assistance from, the Department of Govern- 
ment in Columbia College. The credit for the experiment belongs in the first place 
to Professor Arthur Macmahon and Mr. Joseph McGoldrick, of the Department of 
Government in Columbia University. I thank them for generously permitting 
me to use a part of the data of their experiment in this study. Hearty thanks are 
also tendered to Professor W. A. McCall, whose guidance is mainly responsible for 
whatever value this study may have. 

2 Ruch, G. M.: “Improvement of the Written Examination.” Scott Foresman 
& Co., 1925. 
(A) A true-false “Do Not Guess” test of 210 items. 

(B) A multiple-choice, 5-response “Do Not Guess” test of 159 
items. 

(C) A recall (semi-trade, Completion form) test of 227 items (stu- 
dents strongly advised not to guess). 

(D) An old type essay examination (given two days before Tests 
A, B and C). 

No effort was made to have these tests include precisely the same 
items of information or judgment. Each test was made up independ- 
ently of the others, the objective in each case being to make the best 
one-hour test of achievement in Government I that could be made with 
each of the three forms of questions. To limit each form of test by 
attempting to make all three cover precisely the same points might 
have resulted in undue distortion of some of the unique characteris- 
tics, if any, of one or more of the three forms of test. Professor Mac- 
mahon and Mr. McGoldrick, who constructed and administered all 
the tests used in this study, had had considerable experience in test- 
making; they had used true-false, multiple choice and completion tests 
as regular parts of all final examinations in Government I for four 
years, and in certain other courses also. It may therefore be assumed 
that they have approximately equal skill in making each of these 
three forms of questions, and that the comparison here made between 
the validities of these three one-hour tests is as fair to one form as to 
another. The questions in all three tests incline to be problematical 
rather than informational, many of them involving genuine reasoning 
opportunities. 

The three tests were purposely made long so as to insure that each 
student would spend a full hour on each kind of test question. As far 
as the examiner in charge could observe, every student spent a full 
hour in each of the three tests. They were given in rotation to equalize 
practice effects, the 150 men in the class being divided into three 
equal groups by random selection for this purpose. The students 
were instructed to mark all the easy questions first, and were strictly 
enjoined not to guess on the true-false and multiple choice tests, and 
were strongly advised not to guess on the completion. The students 
knew that the true-false test would be scored T - 2W - O (= R - W), 
and that the multiple choice test might be similarly scored, and 
they knew that the Number Right would be the score on the completion 
test. Since all of the students involved had taken objective tests of 
this sort several times before and were already thoroughly imbued with 
the idea that guessing should be avoided, it is believed that there was 
very little sheer guessing and that this was almost wholly confined to 
the completion test. The students were mainly from the Freshman 
class, but the group included some Sophomores and a few Juniors. 

The Criterion.—The criterion for achievement in the course was 
the average of seven separate measures; a 50-minute true-false quiz 
given five weeks after the course began; a 50-minute completion quiz 
given nine weeks after the course began; the instructor’s estimate 
three days before the end of the course; a 50-minute old type essay 
examination given at the last meeting of the course, just two days 
before the final examination was given; and the three parts of the 
final examination described above. As included in the criterion, 
the latter were scored in the usual manner, Right-minus-Wrong for the 
true-false, and Number Right for the multiple choice and completion 
tests. Each of these seven measures had equal weight in the average, 
that is, in the criterion. 

The Data.—The number of students in Government I was 150. 
Three sets of papers were discarded because of incomplete parts, leaving 
147 sets of tests that were used throughout this experiment. 

Method and Results.—The first step in the procedure was to re-score 
the 147 sets of papers, using as scores: 1, the number of right responses; 
2, the number of wrong responses; and 3, the number of omitted items. 
Right-minus-Wrong scores were found for each of the three tests, and 
R - W/(N - 1) scores for the 5-response test. The mean scores 
and sigmas of the distributions of these scores are presented in Table I. 

TABLE I.—MEANS AND STANDARD DEVIATIONS OF SCORES ON TESTS A, B AND C 

                          Rights   Wrongs   Omissions   R - W   R - W/4 
True-false         M      120.3     44.2      49.7       76.7 
                   σ       25.8     14.8      31.6       27.8 
Multiple choice    M   [illegible]  43.7      41.6       41.6     75.2 
                   σ       21.6     14.6      24.3       26.9     22.3 
Completion         M   [illegible]  57.0      87.2       37.6 
                   σ       30.8     19.7      36.5       37.5 
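
The scoring rules named in this section can be written out as a short sketch. The function names and the sample counts below are hypothetical and serve only to show the arithmetic; in particular, the counts are not taken from any student's paper.

```python
def number_right(rights, wrongs, omitted):
    """Number Right score: wrong and omitted items are simply ignored."""
    return rights

def total_minus_form(total_items, wrongs, omitted):
    """The form announced to students for the true-false test: T - 2W - O."""
    return total_items - 2 * wrongs - omitted

def corrected_for_chance(rights, wrongs, n_choices):
    """Chance-corrected score R - W/(n - 1) for an n-response item."""
    return rights - wrongs / (n_choices - 1)

# Hypothetical true-false paper (illustrative counts only)
rights, wrongs, omitted = 120, 44, 49
total_items = rights + wrongs + omitted

tf_score = total_minus_form(total_items, wrongs, omitted)
# Because T = R + W + O, T - 2W - O reduces to R - W,
# which is also the n = 2 case of R - W/(n - 1).
tf_check = corrected_for_chance(rights, wrongs, n_choices=2)

# Hypothetical 5-response paper: the correction is R - W/4
mc_score = corrected_for_chance(86, 44, n_choices=5)

print(tf_score, tf_check, mc_score)
```

The identity T - 2W - O = R - W holds for any paper, since every item is either right, wrong or omitted.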

TABLE II.—INTERCORRELATIONS OF CRITERION, RIGHTS, WRONGS, OMISSIONS, 
R - W AND R - W/4 SCORES, AND MULTIPLE CORRELATIONS FOR EACH 
OF THE THREE OBJECTIVE TESTS 

                                            Test A: 213   Test B: 159   Test C: 227 
                                            true-false    5-response    completion 
                                            items         items         items 
Correlation of criterion with rights           .748          .850          .880 
[Correlation of criterion with wrongs]        -.264         -.247         -.085 
[Correlation of criterion with omissions]     -.480         -.615         -.730 
[Correlation of criterion with R - W]          .845          .801          .801 
[Correlation of criterion with R - W/4]                      .860 
Correlation of rights with wrongs              .145         -.146         -.052 
[Correlation of rights with omissions]        -.864         -.792         -.840 
Correlation of wrongs with omissions          -.607         -.459         -.459 
[Multiple correlation]                         .847          .861          .890 

The first line of r’s in Table II shows that in terms of Number Right 
scores, the completion test is most valid and the true-false a poor 
third. Applying the chance correction to the true-false and 5-response 
tests, S = R - W/(n - 1), does not change the order, but it does 
practically equalize the validity coefficients of the true-false and 
5-response tests (0.845 and 0.86), and leaves the true-false only 0.035 
behind the completion test. This means that the difference in favor 
of the completion test is roughly three times the PE of the validity 
coefficient of the latter. If the conditions of this experiment are at all 
representative of general testing situations, the near equality of true-false 
and completion tests here indicated is important. 
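
The statement that the difference of 0.035 is roughly three times the PE of the completion coefficient can be retraced as follows. This is a sketch under an assumption: the text does not say which formula was used, and the customary expression PE = 0.6745(1 - r²)/√N is assumed here, with N = 147 papers.

```python
import math

def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient (assumed customary formula)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

r_completion = 0.880   # criterion with Number Right, completion test
r_true_false = 0.845   # criterion with R - W, true-false test
n_papers = 147

pe = probable_error_of_r(r_completion, n_papers)
ratio = (r_completion - r_true_false) / pe

print(f"PE = {pe:.4f}, difference / PE = {ratio:.2f}")
```

On this assumption the PE comes to about 0.0126 and the difference to about 2.8 times the PE, in keeping with "roughly three times."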

On the completion and 5-response tests the R — W scores are 
considerably less valid than Number Right scores. Correcting for 
chance in the 5-response test raises its validity coefficient from 0.85 
to 0.86—a negligible difference. 

The indications are that of the three forms of tests here used only 
the true-false should be corrected for chance. Correction for chance 
in the 5-response test may well be left to the whim of the individual 
examiner, but the increase in validity of the true-false test is so considerable 
that the use of R - W instead of Number Right scores seems 
almost obligatory on users of “Do Not Guess” true-false tests. This 
conclusion seems to be consistent with all the available evidence. 
Ruch and Degraff¹ show that the average increases in validity coefficients 
effected by “corrections for chance” are about as follows for 
“Do Not Guess” tests: 

(1) [illegible] .................................... 0.022 
(2) [illegible] .................................... 0.060 
(3) 2-response and true-false ...................... 0.087 

An increase of nearly 0.09 in validity coefficients already in the 
neighborhood of 0.80 seems sufficient to justify the rule that “Do Not 
Guess” true-false tests should be corrected for chance, i.e., scored 
R - W. 

1 Journal Educational Psychology, Sept., 1926, pp. 368-375, see especially 
Table II, p. 371. See also Journal Educational Psychology, Jan., 1926, pp. 13-14. 

Multiple correlations based on the intercorrelations of Table II, 
and calculated with the aid of Symonds’ four-variable charts, lend 
positive support to the suggestions of the last paragraph: 








                     Multiple correlation    Validity coefficients 
                     r(C.RWO)                from Table II 
True-false           0.847                   0.748 (r_CR) 
                                             0.845 (r_C(R-W)) 
5-response           0.860                   0.850 (r_CR) 
Completion           0.890                   0.880 (r_CR) 











The indication of these figures is clear that there is no possible 
combination of Rights, Wrongs and Omissions which will give signifi- 
cantly higher validity coefficients than Number Right scores on the 
completion and 5-response tests and R - W scores on the true-false test. 
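
The kind of multiple correlation compared here can be illustrated with the ordinary two-predictor formula. The sketch below is not the chart method used in the study: it assumes that rights and wrongs alone serve as predictors (omissions being fixed by the other two when the length of the test is fixed), and it takes the coefficients from Table II as printed. Its results should therefore be expected to come near, but not to reproduce exactly, the values read from Symonds' charts.

```python
import math

def multiple_r_two_predictors(r_c1, r_c2, r_12):
    """Multiple correlation of a criterion with two predictors.

    r_c1, r_c2: correlations of the criterion with predictors 1 and 2
    r_12:       correlation between the two predictors
    """
    r_squared = (r_c1 ** 2 + r_c2 ** 2 - 2 * r_c1 * r_c2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)

# (criterion-rights, criterion-wrongs, rights-wrongs), as printed in Table II
table_ii = {
    "true-false": (0.748, -0.264, 0.145),
    "5-response": (0.850, -0.247, -0.146),
    "completion": (0.880, -0.085, -0.052),
}

multiple_r = {
    test: multiple_r_two_predictors(*coefficients)
    for test, coefficients in table_ii.items()
}

for test, value in multiple_r.items():
    print(f"{test}: R = {value:.3f}")
```

For the 5-response and completion tests the result barely exceeds the Number Right coefficient, while for the true-false test it lies close to the R - W coefficient, which is the point the comparison is meant to make.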

Relation of Validity to Length of Test.—The results presented in 
Chart 1 were arrived at by finding the average correlation of one page 
of the test with the criterion, the average correlation of two pages 
with the criterion, etc., and turning the average correlation values into 
terms of K = 1 - √(1 - r²). The indications of the chart are that the 
growth in validity in terms of K, from a 30-minute test to a 60-minute 
test is just about as fast as the growth from a 15-minute to a 30-minute 
test. This applies to all three of the new type tests used in this 
experiment. The chart also indicates that the three forms of tests 
are nearly equally valid throughout. If we may assume that the 
results of this experiment are reliable and applicable to general testing 
situations, then the apparently common opinion that from 100 to 
200 items “practically exhaust” the validity potentialities of true-false, 
multiple-choice and completion tests is disproved, and the common 
practice of using such short tests in situations which are likely to 
affect materially the life-careers of large numbers of students, as at 
entering or leaving collegiate and professional schools, should be abandoned. 
In two or three colleges and professional schools the new type 
parts of the final examinations in a number of basic courses now regularly 
include from 250 to 500 questions. In at least one instance 
known to the writer a final four-hour examination in a professional 
school included 794 separate objectively scored questions! 

CHART 1.—Rates of growth in validity of true-false, 5-response and completion tests 
in terms of K = 1 - √(1 - r²). 
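
The conversion of correlation values into terms of K may be sketched as below. The sketch assumes that K stands for 1 - √(1 - r²), a quantity that rises with validity as the discussion of Chart 1 requires; the function name is hypothetical, and the coefficients are the 60-minute validity coefficients quoted from Table II, not the page-by-page averages plotted in the chart.

```python
import math

def k_value(r):
    """K = 1 - sqrt(1 - r^2), assumed reading of the index used in Chart 1."""
    return 1 - math.sqrt(1 - r ** 2)

# Validity coefficients of the full one-hour tests, from Table II
coefficients = {
    "true-false, Number Right": 0.748,
    "true-false, R - W": 0.845,
    "5-response, Number Right": 0.850,
    "completion, Number Right": 0.880,
}

k_values = {label: k_value(r) for label, r in coefficients.items()}

for label, k in k_values.items():
    print(f"{label}: r = {coefficients[label]:.3f}, K = {k:.3f}")
```

Because K grows much faster than r at the upper end of the scale, small gains in a validity coefficient near 0.85 appear as sizeable gains in K.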

In passing it may be noted that, according to Chart 1, the 50-min- 
ute essay examination is less valid than 15 minutes of any one of the 
three objective tests. In terms of K, the least valid of the 60-minute 
objective tests is about three times as valid as the 50-minute essay 
examination. It may be added that this comparison, granting the 
validity of the criterion used, is very fair to the essay examination, 
because the latter was given under extremely favorable conditions and 
was scored with great care. 

Experimentation with varied and new forms of questions gives 
promise of producing great improvements in our educational measuring 
devices; but it appears certain that we have failed to exploit fully the 
resources of such well-known veterans as the true-false, multiple-choice 
and completion forms of questions. Apparently we have overlooked 
the most ready-to-hand method available for increasing the validity of our 
tests, namely, the simple device of increasing their lengths. As Pro- 
fessor Toops has acutely remarked, this simple method is patently 
worth trying on an extensive scale when short or moderately long tests 
afford low reliability coefficients. 

If we have been guilty of such an oversight, it is probably due 
primarily to two fundamental errors: (a) an over-estimation of the 
importance and a cloudy interpretation of the meaning of reliability 
coefficients, and (b) an under-estimation of the importance of sampling 
of materials and performances in our tests. Improvements in the 
external forms of test questions may increase the “effective” validity 
of a test, by eliminating the masking effects of chance and irrelevant 
systematic influences; but such improvements can hardly increase the 
“effective” validity beyond the “latent” validity of a test. The 
“latent” validity depends primarily upon the extent to which the test 
represents the universe of facts, skills, or thoughts which it is intended 
to represent, and no other. Thus, by refining the scoring method and 
by improving the form of questions in a test whose latent validity is 
0.80, we might raise its “effective” validity from 0.60 to 0.70 or 0.75; 
and, of course, this would be a priceless improvement. In this experi- 
ment we raised the validity of the true-false test from 0.748 to 0.845 by 
using R — W instead of Number Right scores. But we cannot squeeze 
more water out of a sponge than it contains, although by increasing the 
pressure we might indefinitely increase the fraction of its total content 
obtained. Let us try to increase the amount of validity that might 
be extracted, without diminishing our efforts to salvage larger and 
larger proportions of whatever validity is inherent in our tests. 

Theoretically, the argument here is against a straw-man; practi- 
cally, it is a plea for wider use of a simple device which thus far has 
been as much neglected in educational practice and experimentation 
as it has been accepted in theory by experts. From the standpoint of 
immediate educational administration and guidance of students, it may be 
said that the number and character (or internal constitution) of questions, 
i.e., the size and quality of the sampling of materials and performances 
included in a test, are more important than the external form 
of the questions. What form of test we use may well be left to expe- 
diency, remembering only the generally accepted rule that we should 
use as many forms as the subject-matter requires and as administrative 
convenience will allow—usually from two to five forms. The important 
thing is to make our individual questions more carefully and to make our 
tests include larger numbers of such questions. 











RELIABILITY OF THE GROUP WILL-TEMPERAMENT 
TESTS 


JUNE E. DOWNEY AND RICHARD S. UHRBROCK 
University of Wyoming 


It has been assumed by most critics of the group will-tempera- 
ment tests that it is impossible to repeat them because of the nature 
of the tests. The “test-wise” subject, it is asserted, behaves very 
differently from the naive subject. In the absence of results obtained 
by actual repetition of the tests, their reliability has remained in doubt, 
or has been assumed to be low. 

It seemed wise, therefore, when an opportunity came to carry on 
further investigation of the tests through a grant from the National 
Research Council by its Committee on Migrations Research, to 
begin at this point and to obtain reliability coefficients for every phase 
of the tests. Various methods of scoring the test items have been 
investigated. The results are reported at this time. 

It is not possible here to enter into a description of the group 
will-temperament test.¹ It is sufficient to say that a score is obtained for 
each of the 12 following traits: Speed of movement, freedom from load, 
flexibility, speed of decision, motor impulsion, self-confidence, non- 
compliance, finality of judgment, motor inhibition, interest in detail, 
coordination of impulses and volitional perseveration. The scores for 
8 of the 12 traits are obtained from various writing exercises with 
instructions designed to induce various mental sets such as tendency 
to speed movement, to slow down movement, to disguise handwriting, 
to continue under distraction, etc. The other four traits are scored 
on a somewhat different basis as follows: Speed of decision is scored by 
the number of words underlined in a decision test; self-confidence 
by doubly underlining certain decisions; non-compliance by the extent 
to which the subject acts on suggestions made by the experimenter; 
and finality of judgment by the time spent in reconsidering decisions 
once made. 

The four tests just mentioned are those in which being “test-wise” 
will have most significance. In the speed of decision test, the subject 
gives his opinion as to his own character traits. Because of the opera- 
tion of memory, the conditions may well be very different when he 





1 Downey, June E.: “The Will-temperament and Its Testing.” World Book 
Co., Yonkers, 1923. See Manual of Directions for Downey Group Will-tempera- 
ment Test, also published by the World Book Company, 1923. 
attempts to do this a second time. The same statement holds true 
for finality of judgment. Quite obviously, familiarity with the 
material used and memory of a former reaction decrease the possibility 
of consistent rechecking. When a self-confidence test is repeated 
the conditions may alter, due to memory and sharpened attention. 
The amount of change, presumably, is not equal for all subjects. Non- 
compliance is a trait tested by a procedure which involves false sug- 
gestions made by the examiner. Repetition of the test would seem 
obviously of little value, at least with subjects of average or high 
intelligence who might be expected to see through the technique used. 
This is also a test where discussion, coaching, or even casual remarks 
may seriously influence the results. One would not anticipate for 
these four tests, particularly the two last named, high reliability 
coefficients. The second test is in many respects a different test from 
the first. Alternate tests need to be devised to get an indication of the 
reliability of the methods used. 

The tests based upon writing might be expected to have higher 
reliability than the four just cited. Even if acquainted with the 
test procedure, the subject might be unable to modify his reactions to 
any marked degree. The difficulty in the writing tests is in establish- 
ing the precise mental set desired. 

Three groups were given the will-temperament tests twice, on 
successive days, to determine the reliability of the separate items. 
Since sex and age were recognized as possibly influencing the outcome 
of the tests, different groups were chosen as follows: 42 high school 
boys (median age 15); 37 high school girls (median age 14); 149 Normal 
College women (modal age 19). Table I shows the distributions of 
ages for the three groups of subjects. 


TABLE I.—DISTRIBUTIONS OF AGES
Since the published norms are stated to be only tentative and were
established on responses obtained from adults, coefficients of reliability 
for these three groups of subjects were calculated on the raw scores. 
Various possibilities in methods of scoring were utilized. In all, some 
63 forms were tried out. 
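
The reliability coefficients discussed in this report are correlations between the raw scores of the first and second days. The following is a minimal sketch of such a calculation, assuming the product-moment (Pearson) formula, which the text does not name explicitly. The function name and the scores are hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson_r(first_day, second_day):
    """Product-moment correlation between two equal-length lists of raw scores."""
    n = len(first_day)
    if n != len(second_day) or n < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(first_day) / n
    mean_y = sum(second_day) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(first_day, second_day))
    sxx = sum((x - mean_x) ** 2 for x in first_day)
    syy = sum((y - mean_y) ** 2 for y in second_day)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary on both days")
    return sxy / sqrt(sxx * syy)


# Hypothetical letter counts for five subjects on two successive days.
day1 = [30, 42, 35, 50, 38]
day2 = [32, 40, 37, 52, 36]
print(round(pearson_r(day1, day2), 2))
```

A coefficient near +1 would indicate that subjects kept nearly the same relative standing on the two days; a coefficient near zero would indicate that the second day's standing could not be predicted from the first.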













28 The Journal of Educational Psychology 


Particular attention was given to the possibility of obtaining signif- 
icant data from various types of ratios. One ratio, namely that 
obtained by dividing speeded writing by usual writing, is used in the 
will-temperament test under the name “freedom from load.” The 
other possibilities investigated include the ratio between speeded 
and retarded writing, usual and retarded writing, and various ratios 
obtained from millimeter measurements. Briefly, it may be stated 
that none of these ratios proved to have a high reliability coefficient 
and yet the possibility of using such ratios in temperament 
testing is highly intriguing. Because of the low values obtained 
from these ratios only the most promising are quoted. Others may 
be obtained from the authors by any investigators who are interested. 

In tabulating reliability coefficients the three groups of subjects 
described above will be kept separate. Except for the four tests 
previously grouped together, the tests will be discussed separately. 

Table II gives the reliability coefficients, the means and the stand- 
ard deviations for all the tests. It was expected that the reliability 
coefficients for speed of decision, self-confidence, non-compliance and 
finality of judgment would be low. There are indications, however, 
that changes in technique might increase the reliability. 

The reliability coefficients ranging from +.62 to +.64 for speed of 
decision are distinctly promising. Quite possibly they might be raised 
by considerably increasing the number of items to be checked, and 
lengthening the time of the test. In a non-verbal will-temperament 
test¹ the number of items for speed of decision was increased from 30
to 36. A comparison of the results of the two forms is, however, some- 
what ambiguous. In the case of 49 junior high school girls, the reli- 
ability coefficient for speed of decision in the non-verbal form ran as 
high as +.84; but 90 junior high school boys gave a reliability coeffi- 
cient of only +.51. A still greater increase in the number of items and 
in the time spent on the test should be tried. 

The reliability coefficients for the self-confidence test range from 
+.09 to +.57. There is consistency in the results obtained for the 
junior high school girls and for the Normal College women. The 
reliability coefficients are +.57 and +.51, respectively. The failure 
to find any correlation for the results obtained from the junior 
high school boys, in contrast with the promising correlations in the 
case of the other two groups, suggests the possibility of the boys having 
talked about this part of the test in the interval between the first and 





1 To be reported elsewhere. 
second giving. It was suggested in connection with the speed of
decision test that an increase in the number of items used
might increase the reliability of the test. In the non-verbal form of
the self-confidence test the number of items was increased to 40.
The coefficients of reliability for this form of the test are +.80 for 49
junior high school girls and +.64 for 90 junior high school boys.

In the case of the non-compliance test the coefficients of reliability 
range from +.31 to +.50. The standard deviations are large. It 
is evident that the repetition of the present form does not yield reliable 
results. Possibly the solution of this difficulty lies in the develop- 
ment of alternate forms of the test. 

The reliability coefficients for the finality of judgment test are not 
only low, but the standard deviations are so high as to indicate that 
the test as given and scored has no reliability whatever. The reliability
coefficients range from +.30 to +.46. The raw scores from which 
the coefficients were obtained were in terms of seconds spent in recheck- 
ing the items previously marked in the speed of decision test. A 
more satisfactory method of scoring is obtained by scoring on the flat 
time and also on the number of changes made in checking. The 
difficulty in scoring this test lies, however, partly in the fact that those 
who originally make only a few decisions have much less material 
to work on in the rechecking than those who make a great many 
decisions. The test should be revised so that all subjects recheck the 
same number of items. 

Let us now turn to the other eight tests of the series, first consider- 
ing speed of movement (see Table II). In the published tests, speed of 
movement is determined by the letter count in writing the phrase 
“United States of America” at usual speed. In Table II coefficients 
of reliability are given for this method of scoring and also for speed 
of movement as measured by the total millimeters covered in writing 
the phrase one or more times and also the average millimeter measure 
for writing the phrase. The reliability coefficients range from +.56 
to +.83 and are fairly high except for the group of junior high school 
girls. It appears, however, from studying the means for the letter 
count, that in writing the phrase the first day, the boys and girls speeded
rather than wrote at their usual rate. Quite possibly the mental set 
for speed which operates in taking the usual intelligence test carried 
over into this test. It is very strongly the opinion of the writers that 
this difficulty is encountered in giving the whole series of tests. If 
one may judge from the means for the two days, the high school 
groups, both boys and girls, were in a higher state of tension the first
day than the second. It will be noticed that when the writing
of the phrase “United States of America” was speeded, that is, when
the subjects wrote as rapidly as possible rather than at their usual
rates, the reliability coefficients are uniformly higher. They range
from +.71 to +.88. This furnishes an argument for using the speeded
rather than the usual rate as the item on which to score speed of
movement.

The writing of the name at usual speed, and also as rapidly as 
possible, was checked to determine the coefficients of reliability. In 
the printed form this material appears as Test II-1 and Test II-2. 
The ratio obtained by dividing the letter count for speeded writing by 
the letter count for normal writing is used as a measure of freedom from 
load. The reliability coefficients for writing the name at usual speed
range from +.32 to +.90; for writing the name rapidly they run from
+.74 to +.90. Again, the low reliability coefficients are obtained from
the records of the junior high school girls when normal speed was 
called for. 

The means and standard deviations make possible some curious 
comparisons so far as writing the name and phrase are concerned. 
The name normally is written more rapidly than the phrase and it is 
possible to achieve a very much greater speed in writing it than 
appears to be true in writing the phrase. Possibly the most satis- 
factory speed of movement item would be the letter count on the 
speeded writing of the name. 

In testing freedom from load a ratio was used, namely, the letter 
count obtained from speeded writing divided by the letter count for 
natural writing. Reliability coefficients are reported for both the 
writing of the phrase “United States of America” and the writing of 
the name. These reliability coefficients are far from satisfactory.
The results, however, are thoroughly consistent with those obtained
for speed of movement. We have reason to believe that the junior
high school boys and girls actually had a “speed-set” in their first
writing of the phrase on the day of the first test, while the Normal
College women followed directions more exactly. It will be noticed
that in writing the phrase the reliability coefficient for the Normal
College women is fairly high, namely +.72.
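
The ratio described above can be sketched in a few lines of Python. This is an illustration only: the function name and the letter counts are hypothetical, and the published manual may convert the ratio to a scaled score in a way not shown here.

```python
def freedom_from_load(speeded_letters, usual_letters):
    """Ratio of the letter count for speeded writing to that for usual writing."""
    if usual_letters <= 0:
        raise ValueError("usual letter count must be positive")
    return speeded_letters / usual_letters


# Hypothetical subject who follows directions: usual writing is clearly slower.
print(freedom_from_load(60, 40))

# Hypothetical subject with a "speed-set": the usual writing is already
# speeded, so the ratio is pulled toward 1 on the first day.
print(freedom_from_load(60, 55))
```

If the usual-speed sample is already hurried on the first day and not on the second, the ratio shifts between the two days for reasons unrelated to the trait, which would lower the reliability coefficient in the manner the authors suggest.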

In writing the name the results are not quite the same since the 
natural speed of writing the name is greater than in writing the phrase. 
Apparently, the possibility of speeding on the name is greater than 
speeding on the phrase. It is our impression that the ratio of speeded
to usual or natural writing might better be obtained from writing the 
test phrase rather than the name, and that the low reliability coeffi- 
cients found in the case of the groups of boys and girls might be over- 
come by re-wording the directions for the test in such a way as to 
insure the desired mental sets. The means obtained for the first day 
of the test for the natural writing indicate very clearly that in the case 
of two groups the desired mental sets were not obtained. In order to 
determine whether this test may be made satisfactory, various groups 
tested by various examiners, and with variations in the wording of 
the instructions, will need to be tried. Dramatic suggestion of the 
mental sets desired should be attempted by the examiners. Attitude, 
gestures and voice can be utilized in such suggestion. 

It is interesting to compare the reliability coefficients obtained from
the ratio between speeded and normal writing with those obtained from
the ratio between speeded and retarded writing. Since there is reason
to believe that speeded writing is in certain respects more reliable as a 
measure of speed of movement than usual writing, one might antici- 
pate a higher reliability for a ratio of speeded to retarded writing than 
for speeded to usual writing, except for the fact that the retarded writ- 
ing used in the test is not merely an expression of slow movement but 
represents an actual effort at inhibition. The reliability of this ratio 
might be improved by using milder instructions concerning retarda- 
tion of writing. 

In determining the reliability for coordination of impulses, Test 
V, raw scores were obtained for the number of letters written on the 
line during the limited time given for the test, a departure from the 
method of scoring suggested in the published manual. The reliability 
coefficients are low. A second method of scoring was attempted, 
using the total millimeter measure of the words written. In this 
case, the coefficients of reliability were slightly higher, but not satis- 
factory. It may be questioned whether the scoring methods utilized 
in any way measured the trait for which the test was designed. They 
were, however, tried out in the interest of simplifying the procedure
outlined in the manual of the group will-temperament test. 

The reliability coefficients obtained for motor inhibition, Test 
VII-3, using the scroll count for the third trial, range from +.43 to 
+.76. Just as in the case of the results obtained when the name was 
written slowly, it is evident that a first trial does not yield high reli- 
ability. A comparison of the means for the first and second days 
suggests very definitely the need of preliminary practice in order to
make this test reliable. 

Motor impulsion, Test X, gives fairly high coefficients of reliability 
when the scoring is done on the flat letter count, total millimeter meas- 
ure, or average millimeter measure. The range is from +.61 to +.86. 
It will be noted, however, that such methods of scoring make no attempt 
to bring together speed and size of writing. They score either one or 
the other. Such scoring methods are simpler than those suggested in 
the manual. Their value can only be determined by checking results 
against some criterion. What this criterion shall be we are not at 
present able to suggest. Perhaps the most curious fact revealed in con- 
nection with this test is the result in Test X-1. In this test the name 
was written at usual or natural speed, with the eyes open, in order 
that this material might be used in comparison with the writing of the 
name obtained under distraction. The curious result, to which we are 
referring, is the fact that the name under these conditions was written 
at a much slower rate of speed than is found in Test II-1. The means 
for the results obtained on the first day, for all three groups of subjects, 
indicate a percentage decrease in speed from 12 to 23 per cent. The 
decrease the second day was approximately 9 per cent. The test as 
given is preceded by the one entitled “interest in detail.” It is possible 
that the mental set of care induced by this test carried over into the 
one designed to measure “motor impulsion.”

Investigators who are interested in the psychology of handwriting 
will find some interesting material reported for Test X with reference 
to the effect upon writing amplitude of the different conditions utilized 
in the experiment. The means found in scoring upon total millimeter 
measure and average millimeter measure should be studied carefully. 
On the whole, the withdrawal of visual control in Test X-2, where the 
eyes were closed in writing, resulted in a slight inhibition of the writing 
movement as has been reported previously by other investigators. 
Increase in amplitude of movement occurs when writing under
distraction of attention, a circumstance which tends to increase automatic
control. A comparison of the amplitude of writing when the writer is 
counting taps serially, or keeping track of the number of times a certain 
word is repeated by the examiner, indicates that the former method of 
distraction results on the whole in more automatic movement. 

There remain to be discussed the reliability coefficients for flexi- 
bility, interest in detail, and volitional perseveration. With reference 
to the first two traits it should be said that no effort was made to test 
the method of scoring given in the published manual. No attempt was
made to use qualitative aspects of the situation, that is, to score on the 
degree of success in imitation. The quantitative elements, as in the 
other tests, were limited to letter count and millimeter measures. The 
reliability coefficients for flexibility range from +.63 to +.78. Atten- 
tion should be called to the fact that the reliability coefficient obtained 
when millimeter measure was used can have little significance since a 
model was imitated, a circumstance which would tend to set very 
definite limits for the exercise. The value of such purely objective 
methods of scoring as those just cited will need to be determined by 
some criterion which checks the validity of the test as a measure of 
“flexibility.”

In scoring interest in detail objective measures were sought, just as 
for measuring flexibility. Letter count and millimeter measures again 
were used. As was true in the case of flexibility, the degree of success 
obtained in copying a given model was ignored since its use introduced 
a subjective factor. Two series of reliability coefficients are reported 
since the test on interest in detail involved the copying of the
test-phrase written in two different styles. The reliability coefficients, which
range from +.31 to +.70, are lower than those obtained for flexibility.
It is difficult to see a reason for this. 

It also is interesting to note that the reliability coefficients are more
uniform in the case of the second phrase in which the instructions were
changed from “Copy Model A just as exactly as possible. Speed doesn’t
count. Work carefully, and make as good a copy as you can. If you
should finish before the signal, be sure to begin a second time,” to “Under
‘2’ copy Model B as well as you can, rapidly or slowly, as you prefer.
Just choose your own speed. If you finish one copy, begin a second,
and if you finish a second, begin a third time.” The second model is
a much more conventional and easily copied type of handwriting. This
may account for the difference. Whether these objective methods of
scoring for “interest in detail,” which completely ignore the success or
failure of the individual in achieving the imitation, are valid methods of
testing “interest in detail” as determined by some external criterion
remains a problem for further experimental investigation. In the
case of both flexibility and interest in detail, the standard deviations 
are high. This fact reduces the value of the reliability coefficients. 

It was the original intention in planning the tests for flexibility and 
for interest in detail to score largely on the difference in behavior of the 
subject when reacting to contrasting mental sets. Therefore, the
difference in letter count between the two tests was used. 

The last test to be discussed is that of volitional perseveration. 
In this test the subject attempts to achieve a disguised handwriting. 
The published norms are based on the total time in seconds spent on 
practicing such a disguise. The reliability coefficients are reported 
for two methods of scoring, namely letter count and time in seconds. 
The reliability coefficients range from +.25 to +.58. On the basis of 
the present methods of scoring, the value of this test is dubious. 

It is interesting to summarize, within the limits of one table, the
reliability coefficients which are +.80 or above.


TABLE III.—RELIABILITY COEFFICIENTS .80 OR ABOVE, YIELDED BY THREE WILL-TEMPERAMENT TESTS

Group and scoring method | Speed of movement | Speed of movement | Motor impulsion
 | Test II-1, Test II-2 | Test VI-1, Test VI-2 | Test X-1, Test X-2, Test X-3, Test X-4
Summary and Conclusions.—It appears from the above discussion
that many of the items of the Group will-temperament test are dis- 
tinctly promising so far as reliability is concerned. In a number of 
other tests a slight variation in technique would probably raise the 
coefficients to a satisfactory level. 

There operates in group will-temperament testing, not only a 
social factor, but also a definite mental set or attitude which the 
examiner must seek to control. Patterns of behavior, such as those 
that have been established in school children through the taking of
many tests, must be recognized as existing. A distinct difference in 
the group reaction of unsophisticated adults and test-wise high school 
boys and girls is evident to the examiner. 

Because of the fact that such group patterning of behavior takes 
place, with possible variations from group to group, it is suggested that
each examiner consider his group as a unit, that he work out a per- 
centile distribution of the raw scores, and score by percentile rank 
rather than by published norms. 
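
The suggestion of scoring by percentile rank within the examiner's own group can be illustrated as follows. This is a sketch under one common convention, which the authors do not specify: a score's percentile rank is taken as the percentage of the group scoring below it, plus half the percentage of the group tied with it. The function name and scores are hypothetical.

```python
def percentile_ranks(raw_scores):
    """Percentile rank of each raw score within its own group.

    Assumed convention: percentage of scores below, plus half the
    percentage of scores tied with the score in question.
    """
    n = len(raw_scores)
    ranks = []
    for score in raw_scores:
        below = sum(1 for other in raw_scores if other < score)
        tied = sum(1 for other in raw_scores if other == score)
        ranks.append(100.0 * (below + 0.5 * tied) / n)
    return ranks


# Hypothetical raw scores for a small group treated as a unit.
group_scores = [10, 20, 30, 40]
print(percentile_ranks(group_scores))  # [12.5, 37.5, 62.5, 87.5]
```

Because each subject is ranked only against the members of his own group, a tendency shared by the whole group, such as a general speed-set, does not by itself raise or lower any individual's standing.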














RESULTS FROM THE TESTING OF A GROUP OF COL- 
LEGE FRESHMEN WITH THE DOWNEY GROUP 
WILL-TEMPERAMENT TEST


ARTHUR W. KORNHAUSER 


University of Chicago 


The Downey group will-temperament test was given to 111 
freshmen students in the School of Commerce and Administration of 
the University of Chicago in 1923. The names of the 12 parts of the 
test, the nature of each part, and the correlations with average first 
year marks, are given in Table I. 

Probably no great significance can be attached to any of these 
correlations since all are extremely low. The probable errors are all
about .06. Parts 9, 10 and 11 show the highest coefficients (.19, .20,
and .15 respectively), which suggest a possible relationship between 
these tests and the accuracy and industry which presumably play a 
part in academic achievement. Part 4 is the only other instance of a 
coefficient equal to those mentioned (—.15) and it can readily be con- 
ceived as pointing in the same direction as the others since it shows 
greater caution or deliberation to be slightly related to scholarship 
success. 
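
The probable error of about .06 quoted above agrees with the classical formula PE = 0.6745(1 - r²)/√N applied to the 111 students. The text does not state which formula was used, so the following Python sketch rests on that assumption; the function name is hypothetical.

```python
from math import sqrt


def probable_error(r, n):
    """Classical probable error of a correlation coefficient r from n cases."""
    if n <= 0:
        raise ValueError("number of cases must be positive")
    return 0.6745 * (1.0 - r * r) / sqrt(n)


N_STUDENTS = 111  # number of freshmen tested, as stated in the text

# Coefficients quoted in the text for Parts 9, 10, 11 and 4.
for r in (0.19, 0.20, 0.15, -0.15):
    pe = probable_error(r, N_STUDENTS)
    # The ratio of a coefficient to its probable error indicates how far
    # it stands above chance fluctuation.
    print(r, round(pe, 3), round(abs(r) / pe, 1))
```

On this reckoning even the largest coefficient is only a little more than three times its probable error, which fits the author's caution that no great significance attaches to these correlations.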

A number of correlation coefficients have been calculated between 
certain parts of the Downey test and the ratings of students in the 
traits of accuracy, industry and initiative. The ratings were given on 
a graphic rating scale, by instructors, fellow-students, and self. In 
the case of instructors and fellow-students, the rating used in the 
present calculation is always an average of at least three independent 
ratings. The combinations tried were chosen as among those which 
appeared most promising. The relations are given in Table II. 

The most striking fact about the results is the uniformly poor 
agreement. To some extent this may be due to the inaccuracy and 
unreliability of the ratings, but in the main it appears to reflect an 
absence of close relationship between the specified parts of the Downey 
test and the ordinary impressions of industry, accuracy and initiative. 
Only a few of the coefficients may possibly be significant and these are 
not consistent for all the ratings. The most plausible of the relations 
indicated is that of “volitional perseveration” with estimates of industry
by fellow students and with self-estimates. It is possible, but not
at all likely, that certain combinations of scores on the different parts 
TABLE I.—COEFFICIENTS OF CORRELATION BETWEEN THE 12 PARTS OF THE DOWNEY GROUP WILL-TEMPERAMENT TEST AND FIRST YEAR COLLEGE MARKS

Parts of the test | Nature of each part | r
1. Speed of movement | Number of letters written in 20 seconds at normal speed. |
2. Freedom from load | Ratio between number of letters written in 20 seconds when told to write “just as rapidly as you can” and the number of letters written at normal speed. |
3. Flexibility | Excellence with which subject can disguise his writing and also speed and excellence with which he imitates a given sample of handwriting. |
4. Speed of decision | Number of self-estimates made in 45 seconds by checking in a list of characteristics those which apply to one’s self. | -.15
5. Motor impulsion | Size of writing under distractions and amount written (as compared with normal). Conditions of distraction included writing with eyes closed, writing and counting aloud at same time, writing and simultaneously keeping track of certain words read aloud by experimenter. |
6. Self-confidence | List of statements is given, some of which are true and some of which are false. The subject is scored by the number of statements concerning which he is “absolutely sure.” |
7. Non-compliance | Fewness of changes subject makes in his judgments of truth or falsity of statements, upon being told that some of his judgments are wrong and being given the suggestion that he change them. |
8. Finality of judgment | Shortness of time spent by subject in revising his original self-estimates. |
9. Motor inhibition | Slowness with which scrolls are traced when subject is told to go “just as slowly as you possibly can.” | .19
10. Interest in detail | Accuracy and slowness with which sample of writing is copied when subject is told that he is to copy “as exactly as possible” and that “speed doesn’t count.” | .20
11. Coordination of impulses | Subject is told to write a phrase on a short line as rapidly as he can but without running over. Score is fewness of letters omitted or extended over the line. | .15
12. Volitional perseveration | Length of time spent in practice on the disguising of handwriting. |
TABLE II.—COEFFICIENTS OF CORRELATION BETWEEN CERTAIN PARTS OF THE
DOWNEY TEST AND RATINGS OF SEVERAL PERSONALITY TRAITS OF STUDENTS
of the test (“profiles”) would show significant relationships which the
separate scores do not indicate. 

In general, the conclusion from our present data is that the Downey 
group test fails to show any clear relationship to the abilities and 
characteristics represented in marks of college students and in ratings 
of several character traits. 








A LEARNING PLATEAU DUE TO CONFLICTING 
METHODS OF PRACTICE 


WILLIAM CLARK TROW AND RICHARD SEARS 


Yale University 


Many are the explanations that have been brought forward to 
account for the plateau of the learning curve, but few unequivocal 
experiments have been reported, chiefly because so many factors may 
influence it. The following experiment is presented as demonstrating 
with more than usual clarity the influence of the use of conflicting 
methods in the practice of a motor act in preventing the desired 
improvement. 

The difficulties in setting experimental conditions with a view to 
determining the effect of the choice of method are numerous. The 
best method may be hit upon by the subject at once; or whatever 
method is used, even a poor one, may be continued indefinitely. 
Then, too, a poor method in a somewhat complicated motor skill 
may retroactively inhibit the development of a better method; or, if 
different methods of performing the task are more or less successfully 
employed, they may be ill adapted to careful measurement. Further- 
more, subjects cannot be told to employ different methods: they should 
be taken up and discarded spontaneously to be of experimental inter- 
est, and only occasionally will this happen. 

The activity employed in this experiment was card sorting, and 
the procedure which was followed for the most part overcame the 
difficulties just referred to and produced surprisingly unequivocal 
results. The subject was a graduate student in the Department of 
Education at Yale. For each period of practice he was seated at a 
round table in approximately the same position, and dealt the usual 
pack of 52 cards into four piles as he would in playing. The same deck 
was used throughout the experimentation. The practice period 
occurred between 11 and 12 o’clock each night, because at this time 
the greatest regularity was possible. The only exceptions to this 
were those of the fifth and seventeenth periods, which were placed at 
7 P. M. instead. This alteration apparently had no effect on the 
quality of the performance. 

The first 10 daily practice periods were followed by an interval of 
24 days of no practice, after which the next 10 daily practice periods 
were followed by a second interval of no practice, this time of 40 days.
Then followed 9 practice periods, daily except for one omission. In 
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each separate practice period all the cards were dealt six times. Each 
of these dealings is called a trial. Thus in the total time of the experi- 
mentation, consisting of 29 practice periods, there were 174 trials, 
and 9048 cards were dealt. 
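
The totals just given follow directly from the design of the experiment, as the short Python check below shows. The constant and function names are hypothetical; the figures are those stated in the text.

```python
PRACTICE_PERIODS = 10 + 10 + 9   # three blocks of practice periods
TRIALS_PER_PERIOD = 6            # the pack was dealt six times per period
CARDS_PER_PACK = 52

total_trials = PRACTICE_PERIODS * TRIALS_PER_PERIOD
total_cards = total_trials * CARDS_PER_PACK
print(total_trials, total_cards)  # 174 9048


def period_of_trial(trial_number):
    """Practice period (1-based) in which a given trial (1-based) occurred."""
    return (trial_number - 1) // TRIALS_PER_PERIOD + 1


# The forty-second dealing closes the seventh practice period.
print(period_of_trial(42))  # 7
```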

By means of an Eastman timer the subject recorded the time taken¹
to deal each pack. This method admits of a little inexactness due to 
the distraction element. However, these errors were very small and 
relatively constant. The first 7 practice periods involving 42 trials 
showed marked fluctuations but no improvement. This led the sub- 
ject to consider the advisability of experimenting on some other 
function, for his sole aim at this time was to plot a curve of learning. 
His report of these trials is illuminating: 

“Unsatisfied with each performance I shifted about from one
method of dealing to another. I found that it made some difference
where the pack was placed in relation to my thumb, i.e., whether it
was placed in the palm so that the thumb was placed over the middle 
of the pack, near the top of the pack, or near the bottom of the pack. 
The way in which the cards were pushed out by the thumb was also 
varied. At first the cards were pushed out with a semi-circular rota- 
tion, which placed the cards diagonally in relation to the pack. In 
this way either corner could be grasped in dealing. The nearer corner 
proved to be hard to hold firmly, and the farther corner required too 
long an arm movement.” 

This report clearly shows the nature of this period of selection and 
rejection of methods, some of which were better, some worse; but no 
one of them was adhered to long enough to produce improvement in 
the function. It was therefore suggested to the subject that he adopt 
some one method, whatever one seemed the best, and adhere to it for a 
time, whether it proved immediately advantageous or not. The 
method selected is described in the words of the subject as follows: 

“After the seventh day I used the following method exclusively: 
The cards were held so that the thumb was placed a little below the 
middle of the pack. To push out the cards, the thumb was drawn back
as far as possible, pressed firmly against the top card, and pushed 
straight outward, so that the edge of the card remained parallel with 
the rest of the pack. The movement of the thumb was accompanied 





1 After the first 10 practice periods, when it became clear that the experiment 
would be of greater value than was at first supposed, to satisfy the meticulous, the 
subject recorded the time of each sorting in fractions of a second. The conclusions
would have been the same without this refinement. 
by a slight drawing in of the fingers which held the pack. This tended
to prevent pushing out more than one card at once. Each card was
thus pushed out so that an inch or an inch and a quarter of it
projected from the rest of the pack. Then the card was grasped in the
right hand and laid on its proper pile on the table.”

This adoption of a single method produced even more definite
results than were anticipated. In the following table, the figures in
Column I represent the average time in seconds taken for the six trials
of each practice period. The mean deviations from this average
are shown in Column II. As would be expected, these decreased as
the skill improved. In Column III the average performance for each
practice period is translated into the number of cards dealt per
minute.
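
The three columns can be derived from the six trial times of a practice period as sketched below in Python. The trial times shown are hypothetical, since the individual trial records are not presented, and the function name is likewise an assumption. The published figures in Column III may have been rounded or computed somewhat differently, so this sketch need not reproduce them exactly.

```python
CARDS_PER_PACK = 52


def summarize_period(trial_seconds):
    """Return (average seconds, mean deviation, cards dealt per minute)."""
    n = len(trial_seconds)
    if n == 0:
        raise ValueError("at least one trial is required")
    average = sum(trial_seconds) / n
    # Mean deviation: average absolute departure of the trials from their mean.
    mean_deviation = sum(abs(t - average) for t in trial_seconds) / n
    # One pack of 52 cards dealt in `average` seconds, scaled to a minute.
    cards_per_minute = CARDS_PER_PACK * 60 / average
    return average, mean_deviation, cards_per_minute


# Hypothetical times in seconds for the six trials of one practice period.
average, mean_deviation, cards_per_minute = summarize_period([40, 42, 38, 41, 39, 40])
print(average, mean_deviation, cards_per_minute)
```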


IMPROVEMENT FROM PRACTICE IN DEALING CARDS

Practice period | I. Average number of seconds | II. Mean deviation | III. Number of cards dealt per minute
1 | 48 | 4 | 64.8
2 | 41 | 3 | 76.2
3 | 45 | 2 | 69.6
4 | 47 | 6 | 66.6
5 | 44 | 4 | 70.8
6 | 39 | 3 | 79.8
7 | 41 | 3 | 76.2
8 | 36 | 5 | 86.4
9 | 31 | 2 | 100.8
10 | 29 | 4 | 107.4
11 | 27.6 | 3.4 | 112.8
12 | 24.2 | 1.6 | 129.0
13 | 24.0 | 1.8 | 130.2
14 | 23.2 | 1.4 | 134.4
15 | 43.7 | 2.0 | 138.0
16 | 23.0 | 1.5 | 137.4
17 | 22.1 | 1.9 | 141.0
18 | 21.2 | 1.1 | 147.0
19 | 22.8 | 1.4 | 136.8
20 | 22.5 | 1.9 | 138.6
21 | 30.4 | 0.8 | 102.6
22 | 26.6 | 0.8 | 117.0
23 | 23.8 | 2.5 | 130.8
24 | 25.4 | 1.9 | 123.0
25 | 22.7 | 1.6 | 137.4
26 | 22.9 | 1.5 | 136.2
27 | 25.2 | 2.8 | 123.6
28 | 21.9 | 1.6 | 142.2
29 | 24.1 | 0.5 | 129.0


























The curve, drawn from the figures given in Column III of the table, 
shows graphically the stages in improvement. The performance began 
on a plateau, for the function is one which had been practiced intermit- 
tently before. In the seventh practice period it was clearly shown to 
be no better than in the second. The forty-second time the cards were 
dealt showed no improvement over the twelfth! 
Then, beginning with the eighth practice period, the single method
was chosen and adhered to. The result was at once apparent. In- 
stead of the variable results with no improvement, characteristic of the 
practice up to that time, a sudden change took place. Each practice 
period was far better than the preceding and an inspection of the time 
taken for each trial (the tables are not presented here) shows the same 
marked acceleration. 
Improvement from practice in dealing cards (vertical axis: number of cards dealt
per minute). Practice periods: six sortings each. The dash lines indicate the
longer time-intervals. The vertical line at the seventh practice period marks the
beginning of the use of the single method.





If it were desired to inquire into the pedagogical implications of this 
experiment, they would point clearly to the advantages of instruction 
in cutting down the time required for learning. In this case, the 
implied function of the teacher is not to offer encouragement: moti- 
vation was high throughout the first eight periods. Instead, it is 
to point out best methods, that time may not be wasted in fruitless 
trial-and-error. 

The last section of the experiment, that after the twentieth practice 
period, was performed in order to get further evidence that the subject 
had reached his physiological limit. It seems probable that this is the 
case, at least so far as the method which was employed in dealing is 
concerned, though practice with another method might produce still 
greater speed. 
The initial trials of this last section, those immediately after the 
40-day interval, showed unexpected results which, however, still 
further bear out the general conclusions of the experiment. Whereas 
the performance after the first interval of no practice was better than 
before it, that after the second interval was decidedly inferior, as may be 
clearly seen by the drop in the curve. In fact, it was slightly below the 
one which immediately followed the first interval of no practice. The 
difference was quickly made up, however, and the performance was 
soon practically at the level it had attained before the break occurred. 
The greater irregularity of this part of the curve, the subject believes, 
was due to “a less constant degree of interest and attention”; and 
perhaps in part to the omission of the practice period between the 
twenty-seventh and twenty-eighth periods. 

Whatever may be the causes for this drop, it is clear that they were 
not operating after the earlier period of no practice. It would seem 
that while such an interval may act to improve the function by allowing 
the less well-formed, undesirable connections to drop out, nevertheless 
its more probable influence is to impair the function by disuse. If this 
is true, the continued acceleration of the curve after the first break 
cannot be attributed to the period of no practice. Rather, it takes 
place in spite of it. Since the learning proceeds at approximately the 
same rate after the break that it did before, it is concluded that this 
continued progress was due to the same influence that produced the 
original acceleration, i.e., the adherence to the one method of dealing 
the cards, and further that the initial plateau was due to the conflicting 
methods which were at first employed. 











ON CORRECTIONS FOR CHANCE IN MULTIPLE- 
RESPONSE TESTS 


R. R. FOSTER AND G. M. RUCH 


University of California 


Introduction.—In the September, 1926, issue of this journal 
Ruch and Degraff published certain evidence in favor of instructions 
“not to guess” combined with corrections for chance in multiple-choice
tests. The present paper presents a number of multiple regression
equations which show the relative weights found for rights, 
wrongs, and omissions in estimating a selected criterion to be described 
later. The earlier paper cited explains the general outline of the 
investigation of which the present report is an outgrowth. For 
present purposes it is sufficient to state that 200 items of historical 
information were first given to about 2500 pupils in simple completion 
form. Of this number, 1977 are used in this paper. Since the 
chance element in completion tests is comparatively small, it is 
assumed that the scores earned by pupils on such a test represent fairly 
well their true knowledge about the items in question. The same items 
were also formulated successively into 5-response, 3-response, 2- 
response, and true-false items as nearly as it can be said that they 
remained the “same” under such treatment. There is at least a 
certain rough truth in assuming these items to be the same in all 
variate forms since the wording was retained exactly up to the point 
where the alternate choices were presented. The 1977 pupils were 
divided by chance into 8 smaller groups of from 221 to 281 each, as 
follows: 

5-response; do not guess instructions, 

5-response; guess instructions, 

3-response; do not guess instructions, 

3-response; guess instructions, 

Etc. for 2-response and true-false tests. 

Since all pupils took the completion variate of the test and one 
other form, it is possible to use the completion scores as the criterion or 
dependent variable and the rights, wrongs, and omissions as the 
independent variables. The subscripts have these meanings: 

1. Criterion (completion scores) 

2. Number right 

3. Number wrong 

4. Number omitted. 




The Results.—Tables I to IV present the calculations carried out. 


TABLE I.—CORRELATION COEFFICIENTS OF THE ZERO ORDER

Instructions to Guess

             True-false      2-response      3-response      5-response
$r_{12}$     .818 ± .014     .901 ± .008     .875 ± .010     .905 ± .007
$r_{13}$    -.769 ± .017    -.894 ± .009    -.757 ± .018    -.701 ± .022
$r_{23}$    -.627 ± .026    -.903 ± .008    -.675 ± .023    -.582 ± .029

Instructions Not to Guess

             True-false      2-response      3-response      5-response
$r_{12}$     .784 ± .016     .768 ± .016     .887 ± .009     .897 ± .007
$r_{13}$    -.228 ± .039    -.381 ± .034    -.405 ± .037    -.364 ± .035
$r_{14}$    -.510 ± .030    -.466 ± .031    -.549 ± .031    -.583 ± .026
$r_{23}$    -.261 ± .038    -.038 ± .040    -.107 ± .044    -.117 ± .040
$r_{24}$    -.897 ± .008    -.841 ± .011    -.816 ± .014    -.811 ± .013
$r_{34}$    -.651 ± .023    -.495 ± .030    -.476 ± .034    -.477 ± .031

TABLE II.—COEFFICIENTS OF MULTIPLE CORRELATION

                                 True-false | 2-response | 3-response | 5-response

$R_{1.23}$ (guess)                  .882    |    .920    |    .903    |    .930

$R_{1.234}$ (do not guess)          .903    |    .846    |    .940    |    .935

















TABLE III.—REGRESSION EQUATIONS FOR THE RAW OR GROSS SCORES

Instructions to Guess

True-false ........ $X_1 = .867X_2 - .772X_3 - 2.358$
2-response ........ $X_1 = .762X_2 - .693X_3 - 11.553$
3-response ........ $X_1 = .716X_2 - .384X_3 - 4.853$
5-response ........ $X_1 = .666X_2 - .268X_3 + 7.439$

Instructions Not to Guess

True-false ........ $X_1 = .785X_2 - .806X_3 - .059X_4 + 14.709$
2-response ........ $X_1 = .511X_2 - 1.114X_3 - .433X_4 + 61.032$
3-response ........ $X_1 = .632X_2 - .505X_3 - .114X_4 + 19.916$
5-response ........ $X_1 = .887X_2 - .720X_3 - .409X_4 + 85.362$

TABLE IV.—REGRESSION EQUATIONS USING STANDARD MEASURES¹

Instructions to Guess

True-false ........ $Z_1 = .553Z_2 - .422Z_3$
2-response ........ $Z_1 = .508Z_2 - .485Z_3$
3-response ........ $Z_1 = .668Z_2 - .305Z_3$
5-response ........ $Z_1 = .751Z_2 - .263Z_3$

Instructions Not to Guess

True-false ........ $Z_1 = .841Z_2 - .501Z_3 - .082Z_4$
2-response ........ $Z_1 = .421Z_2 - .557Z_3 - .401Z_4$
3-response ........ $Z_1 = .738Z_2 - .3890Z_3 - .149Z_4$
5-response ........ $Z_1 = .397Z_2 - .573Z_3 - .541Z_4$
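For the two-predictor case (instructions to guess), the standard-measure weights of Table IV and the multiple correlation of Table II follow from the zero-order coefficients of Table I. The sketch below uses the usual textbook formulas for two predictors; the function names are illustrative and are not taken from the paper. It is applied to the true-false values printed in Table I (.818, -.769, -.627).

```python
import math


def two_predictor_betas(r12, r13, r23):
    """Standard-measure regression weights for predicting variable 1 from 2 and 3."""
    denom = 1.0 - r23 ** 2
    beta2 = (r12 - r13 * r23) / denom
    beta3 = (r13 - r12 * r23) / denom
    return beta2, beta3


def multiple_r(r12, r13, beta2, beta3):
    """Multiple correlation R(1.23) computed from the weights and zero-order r's."""
    return math.sqrt(beta2 * r12 + beta3 * r13)


# True-false test, instructions to guess (zero-order values from Table I).
beta2, beta3 = two_predictor_betas(0.818, -0.769, -0.627)
r_multiple = multiple_r(0.818, -0.769, beta2, beta3)

print(round(beta2, 3), round(beta3, 3), round(r_multiple, 3))
```

With these inputs the weights round to .553 and -.422 and the multiple correlation to .882, the figures printed for the true-false test. The three-predictor equations would require solving a three-equation system and are not sketched here.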


Discussion of Results.—It has often been remarked that fools can 
ask questions which wise men cannot answer. More frequently than 
is usually realized, in statistics, fools can compute what wise men can- 
not interpret. It is with somewhat of this feeling that any interpreta- 
tion of the foregoing data is attempted. In the first place it must be 
realized that multiple regression equations presuppose rectilinear 
regression. Inspection of the scatter diagrams and the calculation of 
correlation ratios for certain of these shows that curvilinearity of regres- 
sion is not uncommon although most of the coefficients reported fall 
inside the limits ordinarily allowable for treatment by rectilinear 
methods. Incidentally it may be stated that curvilinearity is more 
often found when the instructions are against guessing. This fact 
would tend to establish a strong probability that a multiple prediction 
equation based upon the use of curvilinear regressions would show an 
even greater superiority for instructions against too free recourse to 
sheer guessing than the paper cited for Ruch and Degraff suggested. 
Many of the correlation charts under instructions not to guess showed 
distributions of a decidedly triangular shape in contrast with the more 
familiar elliptical ones. 

There is probably no particular importance to be attached to the 
relative weights given rights, wrongs, and omissions in Table III since 
we are dealing with special test distributions which are not necessarily 
typical ones. On the other hand, Table IV may have some meaning 
since all scores are expressed in the same variability units. Two facts, 
at least, are evident from Table IV: 

1. The wrongs and omissions do supply additional information not 
inherent in the rights alone. In other words the prediction can be 
raised by the use of the wrongs and omissions (if any) in comparison 
with the zero order coefficients. 

2. The formula, $\text{Score} = R - \frac{W}{n - 1}$, where $R$ is the number right,
$W$ the number wrong, and $n$ the number of alternate
responses presented, would seem, on the whole, to over-penalize slightly. 

¹ Z is defined as $Z = \frac{X - M}{\sigma}$, the deviation of a score from the mean divided by the standard deviation.
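The correction for chance discussed in item 2 can be illustrated with a short sketch. It assumes the formula in question is the conventional one, rights minus wrongs divided by one less than the number of alternatives. The function name and the pupil's figures are hypothetical.

```python
def corrected_score(rights, wrongs, n_alternatives):
    """Conventional correction for chance: R - W / (n - 1).

    Omitted items are neither rewarded nor penalized.
    """
    if n_alternatives < 2:
        raise ValueError("a multiple-response item needs at least two alternatives")
    return rights - wrongs / (n_alternatives - 1)


# Hypothetical pupil on a 200-item test: 120 right, 40 wrong, 40 omitted.
score_true_false = corrected_score(120, 40, 2)   # each wrong costs one full right
score_three_way = corrected_score(120, 40, 3)    # each wrong costs half a right
score_five_way = corrected_score(120, 40, 5)     # each wrong costs a quarter of a right

print(score_true_false, score_three_way, score_five_way)
```

The penalty per wrong answer is fixed by the number of alternatives alone, whereas the regression equations let the data determine the weight given to wrongs and omissions.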

The multiple regression technique would not seem to be a very 
adequate one in determining the proper weighting of rights, wrongs, 
and omissions in multiple-response tests although it does suggest that 
there is something yet to be learned about the handling of such scores. 
It might be possible to generalize extended experience with such test 
scores by means of equations not based upon the assumption of 
rectilinearity of regression but even this does not seem entirely promis- 
ing. At any rate no harm can be done by accumulating more evidence 
like that given above, and it will at least prove fully as instructive as 
tossing coins or expanding the binomial as certain writers seem to 
think to be entirely adequate to the issue. 














THE PINTNER-CUNNINGHAM PRIMARY TEST 
RUDOLF PINTNER 


Teachers College, Columbia University 


The Pintner-Cunningham Primary Mental Test for Kindergarten 
and First Grade Children has been used very extensively since its 
publication in 1923. The writer has received from numerous workers 
the results of their testing, for which he is very grateful. Several 
interesting things have come to light. 

Standards.—Has the widespread use of the test necessitated any 
radical revision of the mental age norms? A certain modification has 
been found necessary, but not a very radical one. We have now 
results from 29,533 children. These cases have been tabulated sepa- 
rately in three groups as follows: 


1925 results ....................................  9,112
St. Louis results ...............................  9,920
1926 results .................................... 10,501

Total ........................................... 29,533


The three groups happen to contain about the same number of 
cases, and we shall call the three groups the 1925, the St. Louis,¹ and 
the 1926 results. Figure 1 shows the median scores for these three 
groups by half-year intervals for as many ages as were available. In 
the very young age groups, age 4 and 4½, and in the older age groups 
beyond age 8 the numbers are rather few. These very young and 
older age groups are in general made up of children in the kindergarten 
and first grade, hence they are not truly representative of their ages. 
They represent the young bright and the old dull cases. The fact that 
the curve practically ceases to rise after age 8 shows the influence of the 
old dull in the first and second grades. The St. Louis results are for 
first graders only and the flatness of the curve from age 5½ on would 
seem to indicate a careful policy of admission and promotion. In the 
significant ages 5, 6 and 7, for which the test is particularly adapted, 
the three curves are fairly close together. 

¹ The writer wishes here to thank Mr. G. R. Johnson of the Division of Tests 
and Measurements of the St. Louis Public Schools for sending him the data of the 
9920 cases tested by his department. 

From a combination of the three groups into a grand total of 
29,533 cases, by smoothing the curve and disregarding the results for 
the young bright and old dull, we get the following revised norms for 
the various scores on the test: 


PINTNER-CUNNINGHAM PRIMARY MENTAL TEST

July, 1926
Revised Age Norms, Based on 29,533 Cases

SCORE    MENTAL AGE       SCORE    MENTAL AGE
  2         3-11            29        6-7
  3         4-0             30        6-9
  4         4-1             31        6-10
  5         4-3             32        6-11
  6         4-4             33        7-1
  7         4-5             34        7-2
  8         4-6             35        7-3
  9         4-7             36        7-5
 10         4-9             37        7-6
 11         4-10            38        7-7
 12         4-11            39        7-9
 13         5-0             40        7-10
 14         5-1             41        7-11
 15         5-3             42        8-0
 16         5-4             43        8-2
 17         5-5             44        8-3
 18         5-6             45        8-5
 19         5-7             46        8-6
 20         5-8             47        8-7
 21         5-9             48        8-9
 22         5-10            49        8-10
 23         6-0             50        8-11
 24         6-1             51        9-0
 25         6-2             52        9-2
 26         6-3             53        9-3
 27         6-4             54        9-4
 28         6-6             55        9-5
                            56        9-7


Relation of Test to Binet.—To what extent is the Pintner-Cunningham 
test measuring the same sort of ability as is measured by the 
Binet Test? This can best be answered by correlating the two tests. 
Four investigators have contributed this kind of data: 


1. Stanford-Binet MA with Pintner-Cunningham MA 
2. Stanford-Binet MA with Pintner-Cunningham scores 
3. Stanford-Binet MA with Pintner-Cunningham MA 
4. Stanford-Binet IQ with Pintner-Cunningham IQ 

There is evidently a fair correlation between the two tests, especially
when we remember the restricted range of the mental ages with 
which we are dealing. 

Relation of Test to Other Intelligence Tests.—We have data by means 
of which we can make some comparison of the Pintner-Cunningham 
with other group tests of intelligence. The four correlations are: 


1. With Haggerty delta 2 IQ’s            r = .79, n = 33 
2. With Detroit scores                   r = .67, n = 108 
3. With Rhode Island IQ’s                r = .78, n = 51 
4. With Pintner non-language scores      r = .69, n = 18 


In general the Pintner-Cunningham test is measuring much the same 
abilities as are the other tests suitable for young children. Interesting 
is the correlation of .69 with the Pintner non-language in spite of the 
fact that this was given in a third grade, a grade somewhat too high 
for the adequate functioning of the Pintner-Cunningham test. 

Reliability or Stability—We have some data on a repetition of the 
test with the same group of children. 


TIME INTERVAL        MEASURE USED        r        n
...                  IQ                 .72       79
...                  IQ                 .85       23
...                  IQ                 .72       20
...                  IQ                 .73       31
...                  ...                .32       16


With the exception of the last coefficient, we note that in general a 
repetition of the test after several months gives a coefficient of corre- 
lation of about .70 or .80, as in a great number of group tests. The 
writer cannot explain the low coefficient of .32 in the last case. He is 
simply reporting on all the data obtained from various workers. 

Geographical Distribution.—In compiling the 1926 data, it was felt 
that something of interest might arise if the cases were tabulated 
according to the part of the country in which the tests were given. 
There were not sufficient data for a tabulation by states, and hence the 
well-known nine geographical divisions of the census were used. The 
number of cases in these nine divisions using only the age groups from 
5 to 8 inclusive is as follows: 





GEOGRAPHICAL DIVISION                                  N
I     New England ..................................   57
II    Middle Atlantic .............................. 1718
III   South Atlantic ...............................  782
IV    East North Central ........................... 2385
V     East South Central ...........................   93
VI    West North Central ........................... 1894
VII   West South Central ...........................  560
VIII  Mountain ..................................... 1272
IX    Pacific ...................................... 1032
      Total ........................................ 9793


The largest number of returns was received from the East North 
Central division, followed closely by the West North Central and 
Middle Atlantic. Contrary to expectation the smallest number was 
received from New England. To what extent such distribution of 
returns represents an interest in educational measurement in schools, 
it is impossible to decide. It would seem, however, to indicate very 
roughly such an interest. 

Medians and quartiles have been calculated for each half-year 
group from age 5½ to 7½ for each geographical division having more 
than 100 cases in such half-year groups. These medians are as follows: 


                              AGE-GROUPS
GEOGRAPHICAL
DIVISION           5½       6       6½       7       7½
I                  ..       ..      ..       ..      ..
II                 21       23      34       33      36
III                ..       23      27       29      32
IV                 18       24      29       30      36
V                  ..       ..      ..       ..      ..
VI                 25       31      35       37.5    39
VII                ..       ..      29       27      ..
VIII               24       26      30       32      34
IX                 22       28      ..       37      38
n                1424     2060    1729     1580    1056


It will be noted that the West North Central division makes the 
highest median at all age groups, followed in general by the Pacific 
and Mountain divisions. The South Atlantic and West South Central 
are generally lowest. 

This analysis suggested a comparison of these data with the 
Army data. The ranking of the states in Army Alpha by Alexander¹ 
has been used. Omitting the two divisions, New England and East 
South Central, for which we have inadequate returns, we obtain the 
following ranking of the other divisions: 

¹ Alexander, H. B.: A Comparison of the Ranks of American States in Army 
Alpha and in Social-economic Status. School and Society, Vol. XVI, No. 405, 1922, 
pp. 388-392. 


                              AGE-GROUPS
GEOGRAPHICAL                                                   ARMY
DIVISION           5½       6       6½       7       7½        RANK
II                 4        5.5     2        3       3.5        5
III                ..       5.5     6        6       6          7
IV                 5        4       4.5      5       3.5        4
VI                 1        1       1        1       1          3
VII                ..       ..      4.5      7       ..         6
VIII               2        3       3        4       5          2
IX                 3        2       ..       2       2          1


Direct comparison of ranks on the Pintner-Cunningham Test 
with the Army ranks is not possible except at age 7 where all the divi- 
sions are represented. At other ages we must re-rank the army divi- 
sions in accordance with the divisions represented on the Pintner- 
Cunningham. Nevertheless a cursory inspection of the ranks shows 
much similarity. Rank correlations, re-ranking to allow for omissions, 
show coefficients ranging from .5 to .8. The closest agreement is at 
age 6, where there is the largest number of cases, namely 2060. 
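The re-ranking and comparison described above can be sketched as follows. The text does not say which coefficient was computed; the rank-difference (Spearman) formula used here, without any correction for ties, is an assumption, and the function names are illustrative. The ranks are those printed in the table above for ages 6 and 7.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_difference_correlation(ranks_a, ranks_b):
    """Spearman's rho by the rank-difference formula: 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


ARMY_RANK = {"II": 5, "III": 7, "IV": 4, "VI": 3, "VII": 6, "VIII": 2, "IX": 1}
AGE_7 = {"II": 3, "III": 6, "IV": 5, "VI": 1, "VII": 7, "VIII": 4, "IX": 2}
AGE_6 = {"II": 5.5, "III": 5.5, "IV": 4, "VI": 1, "VIII": 3, "IX": 2}


def compare_with_army(test_ranks):
    """Re-rank the Army divisions among those represented, then correlate."""
    divisions = list(test_ranks)
    test = average_ranks([test_ranks[d] for d in divisions])
    army = average_ranks([ARMY_RANK[d] for d in divisions])
    return rank_difference_correlation(test, army)


rho_age_7 = compare_with_army(AGE_7)
rho_age_6 = compare_with_army(AGE_6)
print(round(rho_age_7, 2), round(rho_age_6, 2))
```

Under these assumptions the coefficient is about .71 at age 7 and about .81 at age 6, which falls within the range reported and places the closest agreement at age 6.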

The general picture which these results give us of the intelligence 
of children at the beginning of their school career would seem then to 
resemble the picture of the intelligence of adult men in the several 
geographical divisions of the country. It has been argued that the 
Army Alpha results indicate primarily the efficiency of the schooling 
received by the individual examined and that the geographical com- 
parison of Army Alpha results points more to the relative efficiency of 
school systems than to the inherent intelligence of individuals. Our 
comparison of the Pintner-Cunningham results with the Army Alpha 
would seem, however, to contradict such an assertion, particularly 
if these data could be increased and if comparisons on other tests led 
in the same direction. A great many of the children considered here 
were just entering school; few of them had had more than two years of 
schooling. We must, therefore, conclude tentatively that the differ- 
ences in schools as between one geographical division and another do 
not seem to be the most significant cause of the difference in intelli- 
gence scores between such divisions, since we find a resemblance between 
the intelligence scores of children just beginning their school career 
and of men who have finished their schooling. 
As to just what the most significant factor causing such differences 
between geographical divisions may be, our data do not help. The 
environmentalist may abandon his belief in the influence of schooling 
and fall back upon the importance of the first five or six years of pre- 
school life. The hereditarian can only hope to push back such com- 
parisons as we have made to an earlier age and accumulate data for 
infants of twelve months or even six months. 


ANNOUNCEMENT 


The National Society of College Teachers of Education will hold 
its annual meeting in Dallas, Texas, on Monday and Tuesday of the 
week of the meetings of the Department of Superintendence, February 
21-27. President Walter S. Monroe announces that the morning 
meetings will be for members only: that the Monday afternoon meeting 
will be devoted to a discussion of the problem “Graduate Work and 
Research” and that the Tuesday afternoon meeting will be a joint 
session with the Educational Research Association. A large attend- 
ance is expected. The entire program will be announced later. The 
headquarters of the Society in Dallas will be the Hotel Jefferson. 
INTELLIGENCE TESTING 


What Intelligence Tests Are Based Upon. Joseph Peterson. Industrial 
Psychology, Sept., 1926, 569-579. Various definitions of intelligence are stated 
and an historical survey of the development of measures of intelligence with 
photographs of several leaders is given. 

Stanford-Binet Retests of 441 School Children. Gertrude Hildreth. The 
Pedagogical Seminary and Journal of Genetic Psychology, Sept., 1926, 365-386. 
A study of many factors which bring about IQ fluctuations and the degree of 
constancy found by retests. 

Performance Tests for Three, Four and Five Year Old Children. Nancy Bayley. 
The Pedagogical Seminary and Journal of Genetic Psychology, Sept., 1926, 435-454. 
A new performance test is presented with norms based upon 90 to 100 children. 

The Relation of Dentition to Mental Age. Francis J. Perkins. The Pedagogical
Seminary and Journal of Genetic Psychology, Sept., 1926, 387-398. A 
study of 555 public school children shows that chronological age is more closely 
related to dentition than is mental age. 

Optimum Difficulty of Group Test Items. Glen W. Cleeton. The Journal of 
Applied Psychology, Sept., 1926, 327-340. It was attempted to discover whether 
or not there was an optimum range of difficulty of test items when used for pre- 
dictive purposes in closely homogeneous groups. 

The Predictive Value of the Yale Classifications Tests. John E. Anderson and 
Llewellyn T. Spencer. School and Society, Sept. 4, 1926, 305-312. Data is 
given for three classes covering the four year period of each class. 

Are We a Nation of Morons? Florence M. Teagarden. Industrial Psychology, 
Aug., 1926, 535-543. The author discusses definitions of intelligence, its limits 
of growth and social implications. 

Psychological Test Ratings and College Entrance Age. S. M. Whinery. School 
and Society, Sept. 18, 1926, 370-372. Students who make highest ratings on 
the psychological entrance examination at Ohio State University are one year and 
four months younger than those who make the lowest ratings. 

The Use of Intelligence Tests by Universities. Herbert A. Toops. School and 
Society, July 17, 1926, 87-88. The findings of a questionnaire study of the work- 
ing principles developed by universities in regard to tests are summarized. 


ACHIEVEMENT TESTING 


Iowa Placement Examinations. George D. Stoddard. School and Society, 
Aug. 14, 1926, 212-216. This paper treats in a general way the purposes, methods 
of construction and reliabilities of these examinations. 






60 The Journal of Educational Psychology 


Analysis of the Iowa Placement Tests. T. A. Langlie. The Journal of Applied 
Psychology, Sept., 1926, 303-314. This is a statistical analysis showing the 
relationships between training tests, aptitude tests and intelligence tests. The 
aptitude tests seem to measure intelligence plus training. 

Do Old and New Type Examinations Measure Different Mental Functions? 
Donald G. Patterson. School and Society, Aug. 21, 1926, 246-248. The average 
intercorrelation (.52) between the old and new type examinations proves to be 
the same as the reliability of the old type. 

Short Answer Examinations in the Social Studies in the Elementary School 
Grades. G.M. Ruch and Others. Public Personnel Studies, Oct., 1926, 274-276. 
Five types of objective examinations are compared as to reliability and validity 
when children are instructed to guess and when instructed not to guess. The 
effect of applying corrections for chance is pointed out. 

Diagnosing Student Shortcomings in English Composition. Walter S. Guiler. 
Journal of Educational Research, Sept., 1926, 112-119. An analysis of the results 
of four diagnostic tests on 103 college students points out the most frequent diffi- 
culties and need for much individual instruction. 

Student Opinions of Types of Examinations. Nira M. Klise. School and 
Society, July 3, 1926, 23-24. The opinions of 265 college freshmen as to the 
fairness of the essay type of examination compared to the more objective types 
are summarized. 


PSYCHOLOGY OF LEARNING AND OF SCHOOL SUBJECTS 


Experimental Education and the Nursery School. Arnold Gesell. Journal of 
Educational Research, Sept., 1926, 81-87. The author calls attention to many 
psychological problems in this field which need investigation. 

Talkativeness about, in Relation to Knowledge of, Social Concepts. H. Meltzer. 
The Pedagogical Seminary and Journal of Genetic Psychology, Sept., 1926, 497- 
507. The relationship is shown between children’s grasp of certain concepts 
and the number of words they used in attempting to give their meanings. 

Comparison of the Group and Individual Methods of Teaching Spelling. E. E. 
Keener. The Journal of Educational Method, Sept., 1926, 31-35. The results 
are based upon a controlled experiment with 488 pairs of pupils in Grades II to 
VIII inclusive. 

An Experiment in Vocabulary Building. Clara D. Lebeis. The Journal of 
Educational Method, Sept., 1926, 9-15. Six activities are used with primary 
children as an aid in increasing vocabulary. Samples of results are given. 

The Problem of Learning. Lawrence K. Frank. Psychological Review, Sept., 
1926, 329-351. This is an attempt to reconcile the trial-and-error theory of 
learning with the Gestalt theory. 

Can the High School Pupil Improve His Reading Ability? Dudley H. Miles. 
Journal of Educational Research, Sept., 1926, 88-98. A controlled experiment in 
the improvement of silent reading is reported and the influence of the teacher, 
intelligence and initial ability on gain is pointed out. 

A Study of the Vocabulary of Scientific Articles Appearing in Daily News- 
papers. Francis D. Curtis. School and Society, June 26, 1926, 821-824. The 
percentage of scientific terms in the total vocabulary and in the vocabulary not 
included in the Thorndike list is given from an analysis of 630 newspaper articles. 
The Effect of Required Themes on Learning. O. A. Ullrich. Journal of Educational
Research, Nov., 1926, 294-303. The requirement of weekly themes did not 
seem to increase students’ acquisition of information significantly. 

The Effect of Mental-set or Attitude on the Reading Performance of High School 
Pupils. Carter V. Good. Journal of Educational Research, Oct., 1926, 178-186. 
Five forms of a standardized reading test are administered to a freshman class 
with directions so varying as to induce different mental sets. 

A Study of the Phenomenon of Reminiscence. Osborne Williams. Journal of 
Experimental Psychology, Oct., 1926, 368-387. The author reports a study of 
reminiscence and retention of poetry and nonsense syllables having used over 4000 
subjects ranging in age from 9.6 years to adults. 

An Analysis of Study Questions Found in Textbooks for the Intermediate Grades. 
Nelle E. Moore. The Elementary School Journal, Nov., 1926, 194-208. A 
psychological analysis was made of the questions in 18 texts in reading, history 
and geography. Of the 4228 questions, 67.8 per cent were thought questions. 

Size of Class for Mentally Retarded Children. Alice B. Metzner and Charles 
Scott Berry. The Training School Bulletin, Oct., 1926, 241-251. The conclu- 
sion was reached through experimentation with four groups and a control that 
“on the average the classes with an enrollment of 25 pupils made as much progress 
as those with an enrollment of 15, 20, or 22 pupils.” 

An Experiment with Manuscript Writing in the Horace Mann School. Edwin H. 
Reeder. Teachers College Record, Nov., 1926, 255-260. This study with three 
fifth grades and four fourth grades seems to show that manuscript writing can be 
speeded up to meet the norms for these grades without undue stress. 

Print-script and Cursive-script in Schools: An Investigation in Nervo-muscular 
Re-adjustments. W.H. Winch. The Forum of Education, June, 1926, 123-138. 
Two experiments are reported of effects upon facility and speed of changing from 
one form of writing to the other. Such changes seem to demand serious adjust- 
ments made early in the child’s school career. 

The Test-study Method versus the Study-test Method in Teaching Spelling. L. R. 
Kilzer. The School Review, Sept., 1926, 521-525. The results of this study 
demonstrate the superiority of that method which first tests the child’s knowledge 
of words and then directs his study efforts to the words incorrectly spelled. 

Improving Handwriting through Diagnosis and Remedial Treatment. Paul V. 
West. Journal of Educational Research, Oct., 1926, 187-198. The experimental 
groups trained by the method of diagnosis and remedial treatment show substantial 
gains over the control groups. 

Solving Arithmetic Problems I. Carleton W. Washburne and Raymond 
Osborne. The Elementary School Journal, Nov., 1926, 219-226. A study is 
made of the causes of pupils’ difficulties through analysis of the results of tests 
given to a large number of children. 

Case Studies in Reading. Joseph A. Baer. Educational Research Bulletin, 
Oct., 1926, 319-328. An outline to aid the teacher in making case studies in 
reading is offered. 

A Study of Pupil Errors in Chemistry. J.C. Bennett. Journal of Educational 
Research, Nov., 1926, 275-283. Results of 1470 pupils on Power’s General 
Chemistry Test are summarized and an analysis made of certain questions in order 
to determine the causes of errors. 
The Effect of Group Activity and Individual Effort in Developing Ability to Solve 
Problems in First Year Algebra. W. A. Barton, Jr. Educational Administration 
and Supervision, Nov., 1926, 512-518. As a result of an equivalent groups 
experiment the author concludes that the group discussion method is superior for 
children of normal or superior intelligence. 

A Vocabulary of Scientific Terms for High School Students. S. R. Powers. 
Teachers College Record, Nov., 1926, 220-245. The vocabulary presented here 
is composed of 1828 words whose importance has been determined by range and 
frequency of their occurrence in science texts and miscellaneous materials. 

The Commonest Syllables. Carleton Washburne. Journal of Educational 
Research, Oct., 1926, 199-205. The syllables of words in Thorndike’s The Teacher’s 
Word Book were tabulated and ranked (1) according to the number of words 
in which found and (2) according to the sum of the indices of words in which found. 

Testing Vocabulary in History. Mary S. Gold. The Historical Outlook, Oct., 
1926, 285-291. The author presents an unstandardized test which should prove 
useful in testing vocabulary in Ancient and Medieval History courses. 

Size of Recognition and Recall Vocabularies. Percival M. Symonds. School 
and Society, Oct. 30, 1926, 559-560. The recognition and recall vocabularies of 
ninth grade children were extensively tested. Estimates of the size of the recogni- 
tion vocabulary at different grade levels are offered. 


TEACHERS’ MARKS 


A National Survey of the Grading of College Freshmen Composition. H. W. 
James. The English Journal, Oct., 1926, 579-587. Variability is shown in the 
grades assigned by many college instructors and suggestions made for remedying 
this condition. 

Comparison of Teacher and Student Estimates of Grades. Paul L. Whitley. 
School and Society, Aug. 28, 1926, 278-280. “The average agreement of ratings 
made by students compares favorably with the ratings of the instructor.” 

A Method of Computing Accomplishment Quotients on the High School and Col- 
lege Levels. Charles C. Peters. Journal of Educational Research, Sept., 1926, 
99-111. A formula is proposed for use on these higher levels and its application 
to 1300 college students discussed. 


MENTAL MEASUREMENTS 


Racial Differences in the Intelligence of School Children. Florence L. Goodenough. 
Journal of Experimental Psychology, Oct., 1926, 388-397. A tabular 
summary of previous studies in this field is given together with the results of 2457 
children of foreign parentage on the Goodenough Intelligence Test. 

Distribution of Intelligence Quotients of Twenty-two Thousand Primary School 
Children. Lexie Strachan. Journal of Educational Research, Oct., 1926, 169-171. 
Distributions are given for white, negro and foreign children and comparisons 
are made with Terman’s data. 

An Inquiry into the Relative Values of the Inventive and Selective Forms of Group 
Tests of Mental Capacity. J. B. Cannon. The Australian Journal of Psychology 
and Philosophy, June, 1926, 141-149. The form of objective test in which the 
child selects the correct response from several responses is shown to be as valid and 
reliable a measure as the form in which the child must invent the correct response. 

The Intelligence of Indian Children. J. A. Fitzgerald and W. W. Ludeman. 
The Journal of Comparative Psychology, Aug., 1926, 319-328. From an analysis 
of results of standard intelligence tests the authors point out various possible 
causes for Indian children being below the norms. 

The Correlation between Intelligence and Size of Family. H. E. G. Sutherland 
and Godfrey H. Thompson. The British Journal of Psychology, Oct., 1926, 6-92. 
As a result of the careful collection and analysis of data it is concluded that among 
unselected children there is a correlation of approximately —0.2 between size of 
family and intelligence of its members. 

The Constancy of “g,” General Intelligence. C. S. Slocombe. The British 
Journal of Psychology, Oct., 1926, 93-110. This study adds further evidence to 
the constancy of general intelligence. 

The Constancy of the Intelligence Quotient—Final Results. P. L. Gray and R. E. 
Marsden. The British Journal of Psychology, July, 1926. Results of annual 
retests for from one to five years show that there is no marked median change in 
IQ. 

A Comparison of Group Verbal and Pictorial Tests of Intelligence. Constance M. 
Davey. The British Journal of Psychology, July, 1926, 27-28. Six pictorial 
tests were constructed and compared with various verbal tests. The conclusion 
is reached that “we find no evidence to support the claim made by some psychologists 
that the non-verbal test is the better measure of ‘general intelligence’ because 
it excludes the language factor.” 


EDUCATIONAL MEASUREMENT 


A Method of Determining Composite Educational Scores. Marvin L. Darsie. 
Journal of Educational Research, Nov., 1926, 270-274. A formula is suggested 
for computing composite scores so constructed that the relative educational 
importance of various subjects will be taken into account. 

A Modified Form of the True-false Test. Howard Y. McClusky and Francis D. 
Curtis. Journal of Educational Research, Oct., 1926, 213-224. This modified 
form involves the detection of the erroneous elements and the substitution of 
correct elements. 

The Correction of Constant Errors in College Marks. Robert S. Ellis. School 
and Society, Oct. 2, 1926, 432-436. This is a method of finding a correction for the 
marks in each subject by finding their deviations from the combination of average 
class marks and average psychological scores. 

The Influence of Family on School Marks. Charles H. Griffitts. School and 
Society, Dec. 4, 1926, 713-716. The average school marks of different pairs of 
siblings are compared and differences between the averages of children of small 
families and those of large families are pointed out. 

The Reliability of Essay Marks. Godfrey H. Thompson and Stella M. Bailes. 
The Forum of Education, June, 1926, 85-91. A number of essays were marked by 
different judges. The reliabilities obtained were higher than most such experi- 
ments show (.50—.80), but it is pointed out that they are comparatively much lower 
than items on objective tests. 

Some Comparison of Freshmen Boys and Girls. C. C. Crawford. School and 
Society, Oct. 6, 1926, 494-496. This is an attempt to account, in part, for the 
higher average grade of girls than boys in the University of Idaho. 


MEASUREMENT OF TEACHING EFFICIENCY 


An Experiment with Standardized Tests in a State Teachers’ Examination. 
L. V. Cavins. Journal of Educational Research, Oct., 1926, 206-212. Results 
of the use of standardized tests in composition, history and geography for the 
certification of teachers are reported. 

The Relative Worth of Short Answer and Free Answer Material in Elementary 
Teacher Tests. William A. Hannig. Public Personnel Studies, Oct., 1926, 277-278. 
The free answer, true-false and completion types of tests are compared as 
means of selecting and ranking teachers. 

Standardized Tests for Elementary Teacher. F. B. Knight, G. M. Ruch, J. E. 
Bathurst and Fred Telford. Public Personnel Studies, Oct., 1926, 279-298. The 
methods of construction, reliabilities and validities of these tests for predicting 
teaching success are given. Copies of the tests are appended. 

A Conduct Scale for the Measurement of Teaching. Ellsworth Collings. The 
Journal of Educational Method, Nov., 1926, 97-103. This is a scale for measuring 
purposeful activity of children. 


CHARACTER AND PERSONALITY 


Character Tests and Measurements. Report of the subcommittee on Character 
Tests and Measurements of the Committee on Character Education of the N.E.A., 
Edwin D. Starbuck, chairman. U.S. Department of Interior, Bureau of Edu- 
cation Bulletin (1926) No. 7, Chapter V. This is a classification of the principal 
investigations in this field with a bibliography of 153 titles. 

A Measurement of Sociability. A. R. Gilliland and Ruth S. Burke. The 
Journal of Applied Psychology, Sept., 1926, 315-326. Ability to remember photographs 
and connect with them significant facts, together with a questionnaire, were 
used to measure sociability. 

Personal Estimates of Character Traits. Richard S. Uhrbrock. The Pedagogical 
Seminary and Journal of Genetic Psychology, Sept., 1926, 491-496. Significant 
differences are pointed out between the self ratings of college men and 
women. 

How Personalities Are Found in Industry. Donald A. Laird. Industrial 
Psychology, Oct., 1926, 654-662. The development and use of a rating scale to 
measure introversion and extroversion is reported. 

Testing the Knowledge of Right and Wrong. Hugh Hartshorne and Mark A. 
May. Religious Education, Apr., 1926, 239-252. This is a statistical analysis 
of the preliminary forms of the moral knowledge tests being constructed for the 
Character Education Inquiry. 

What Constitutes Campus Popularity in Course or Individual? Emily S. Dexter. 
School and Society, June 12, 1926, 758-760. Over two hundred girls of Agnes 
Scott College gave opinions in regard to traits in teachers and fellow students 
which make for popularity. 

Personality as “Habit Organization.” H. E. Garrett. The Journal of Abnormal 
and Social Psychology, Oct.-Dec., 1926, 250-255. An attempt was made to 
measure the degree of integration of fundamental habit systems by use of ratings 
on a habit system chart. 

An Investigation into the Development of the Moral Conceptions of Children. 
Part II. Eve Macaulay and Stanley H. Watkins. The Forum of Education, 
June, 1926, 92-108. An analysis is made of the results obtained when children 
were asked to name the persons they most wished to resemble. 


CASE STUDIES 


Phenomenal Memorizing as a “Special Ability.” Harold Ellis Jones. The 
Journal of Applied Psychology, Sept., 1926, 367-377. A case study of remarkable 
memory for statistical facts and criticisms of the theory that the feebleminded 
tend to show a high development of “memory.” 

Just Naturally Bad. Howard E. Signor. Industrial Psychology, Nov., 1926, 
712-716. Three case studies from the Juvenile Adjustment Agency of Toledo. 


MISCELLANEOUS 


The Teaching of Psychology in Teacher-training Institutions of the South. Joseph 
Peterson and Gladys Dunkle. Psychological Review, Sept., 1926, 385-396. A 
questionnaire study of the preparation of teachers of psychology is summarized. 

The School Child’s Choice of Companions. Beth Wellman. Journal of Edu- 
cational Research, Sept., 1926, 126-132. The mental and physical character- 
istics of 27 pairs of girls and 29 pairs of boys are compared. 

Community Differences in Play Behavior. Harvey C. Lehman. The Peda- 
gogical Seminary and Journal of Genetic Psychology, Sept., 1926, 477-490. It 
is shown that the play activities of communities and groups in the same com- 
munity differ considerably. 

A Comparison of the Play Activities of Town and Country Children. Harvey C. 
Lehman. The Pedagogical Seminary and Journal of Genetic Psychology, Sept., 
1926, 455-476. Some of the major differences in the types of activities engaged 
in are pointed out. 

Table of Standard Errors and Probable Errors of Percentages for Varying Num- 
bers of Cases. Harold A. Edgerton and Donald G. Patterson. The Journal of 
Applied Psychology, Sept., 1926, 378-391. This table is based upon the formula 
for the standard error of a proportion and covers a range in the number of cases 
from 25 to 1,000,000. 
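
The entry above does not reproduce the formula, so the following is only a minimal sketch of the conventional one: the standard error of a proportion p in N cases is the square root of p(1 - p)/N, and the probable error is 0.6745 times the standard error. The function names and the sample values of N are assumptions chosen for illustration and are not taken from the published table.

```python
import math

# 0.6745 is the conventional factor converting a standard error to a
# probable error under the normal curve.
PROBABLE_ERROR_FACTOR = 0.6745


def standard_error_of_percentage(percent, n_cases):
    """Standard error, in percentage points, of an observed percentage."""
    p = percent / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_cases)


def probable_error_of_percentage(percent, n_cases):
    """Probable error, in percentage points, of an observed percentage."""
    return PROBABLE_ERROR_FACTOR * standard_error_of_percentage(percent, n_cases)


if __name__ == "__main__":
    # Illustrative values only; they are not copied from the published table.
    for n_cases in (25, 100, 400):
        se = standard_error_of_percentage(50.0, n_cases)
        pe = probable_error_of_percentage(50.0, n_cases)
        print(f"N = {n_cases:>4}: SE = {se:.2f}, PE = {pe:.2f}")
```

Because the error shrinks with the square root of N, quadrupling the number of cases halves both the standard error and the probable error.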

Methods of Investigation of Study Habits. Percival Symonds. School and 
Society, July 31, 1926, 145-152. This is a critical review of the methods which 
have been employed in discovering the study habits of school pupils. 

What Do University Students Read? Henry O. Severance. School and Society, 
June 5, 1926, 726-728. Records were kept by librarians of books borrowed and 
magazines read by university students for recreational reading. 


CHILDREN’S DRAWING ABILITY AS A MEASURE OF INTELLIGENCE 


Measurement of Intelligence by Drawings, by Florence L. Goodenough. 
Yonkers, N. Y.: World Book Company, 1926. Pp. 177. $1.80. 


To the present-day objective literature of child development, Dr. 
Goodenough has added a typical and valuable study. Planned as a 
group scale for measuring intelligence in the kindergarten and primary 
grades, the materials of the scale and the incidental products of the 
research suggest many interesting problems in the psychology of 
early childhood. 

The study is based upon the thesis that intellectual growth during 
the years from four to ten is closely paralleled by growth in expressive 
drawing ability. In this connection drawing as a means of expression 
is not to be confused with artistic creation as judged from a purely 
esthetic viewpoint. 

One chapter in the book, devoted to a brief summary of studies 
during the past forty years, furnishes material to support the thesis. 
If this chapter is to be used as a bibliography, the omission of reference 
to Dr. Stella McCarty’s coincident study, Children’s Drawings, 
published in 1924 is to be regretted. 

The scale presented is based upon the drawing of a man. The final 
scale is the result of minute analysis of many drawings classified accord- 
ing to age and normal grade placement. It is standardized upon over 
3000 drawings of children between the ages of four and ten. The 
average correlation with the Binet Scale is .76. The PE of estimate 
of IQ is 5.4 points. The reliability of the scale varies from .80 to .90. 
Judged statistically the scale would seem to compare very favorably 
with other group tests for children of the ages studied. 
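
The review does not say how the probable error of estimate was obtained. As a rough consistency check only, and assuming the conventional formula together with the assumption that the reported average correlation applies to the same cases as the reported probable error, the two figures can be related as follows; the symbol \sigma_{IQ} (the standard deviation of the Binet IQs) is introduced here and does not appear in the review.

```latex
\begin{align*}
PE_{\text{est}} &= 0.6745\,\sigma_{IQ}\sqrt{1 - r^{2}} \\
\sqrt{1 - (0.76)^{2}} &= \sqrt{0.4224} \approx 0.650 \\
5.4 &\approx 0.6745 \times 0.650 \times \sigma_{IQ}
  \quad\Longrightarrow\quad \sigma_{IQ} \approx 12.3
\end{align*}
```

On these assumptions the reported figures would imply a standard deviation of roughly 12 IQ points in the standardization group; the review itself gives no such figure.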

Some points in favor of such a test are the ease of giving due to the 
fact that a minimum of language comprehension is necessary, the 
relatively short time necessary to give the test, and the universality of 
appeal of the subject. 

Possibility of practice effect is shown to be rather negligible as far 
as schoolroom experience is concerned. With this question in mind 
one class of first grade children were coached and the results of later 
tests tabulated. In individual instances home practice might influence 
the score but this factor would appear not to affect materially the 
ratings of any one group of children. 

The book is so planned that an experienced tester may use a series 
of specimen drawings for practice scoring. While the details of the 
51 items seem to be very complicated, results have shown that a skilled 
scorer is able with practice materially to increase his speed. 

This scale is a valuable contribution to the field of measurements, 
not only as a supplement to existing tests but as a means of reaching 
the foreign or deaf child. BESS V. CUNNINGHAM. 
Teachers College, Columbia University. 





ANALYSIS OF INTELLIGENCE TESTS 


Effect of Age and Experience on Tests of Intelligence, by Vernon A. 
Jones. New York: Teachers College, Columbia University, 
Contributions to Education, No. 203, 1926. Pp. 74. 


A Study of the Nature of Difficulty, by Jacob S. Orleans. New York: 
Teachers College, Columbia University, Contributions to Education, 
No. 206, 1926. Pp. V + 39. 


Much of the criticism which has been hurled against the testing 
movement has been just. This critical attitude has been stimulated 
by the frequent failures of test makers to demonstrate the validities 
and reliabilities of their tests. One of the most hopeful signs in 
this movement is that there is developing an increasing tendency 
critically to evaluate the elements which enter into each new standard- 
ized test, as well as to demonstrate the accuracy of the measuring 
instrument as a whole. 

Two recent studies contribute to this general field. Jones set 
out to evaluate widely used sub-tests in order to determine the effect 
of age and experience upon children’s responses to them. 

The subjects of the investigation were 487 children ranging from 
grades three to seven and differing widely in environment. The sub- 
tests evaluated were selected from the National A and B, Haggerty 
Delta 2 and Otis S-A Advanced and Primary intelligence tests. 

A mental age was determined for each child; then, by use of the 
partial correlation technique, each sub-test was correlated against 
mental age and against chronological age. These two coefficients 
were taken as criteria of the goodness of a sub-test as a measure of 
intelligence and combined into a single measure by a special use of 
the coefficient of alienation. By use of this criterion the sub-tests 
were ranked on the basis of their excellence. 
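
The review does not give Jones's formulas, so what follows is only a sketch of the two standard quantities it names: the first-order partial correlation and the coefficient of alienation. The particular way in which Jones combined the two coefficients into a single criterion is not described here and is not reproduced. All function names and the numbers in the usage lines are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x with y when z is held constant (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


def coefficient_of_alienation(r):
    """k = sqrt(1 - r^2), the degree to which two measures fail to agree."""
    return math.sqrt(1.0 - r * r)


if __name__ == "__main__":
    # Made-up correlations: sub-test with mental age, sub-test with
    # chronological age, and mental age with chronological age.
    r_test_ma, r_test_ca, r_ma_ca = 0.60, 0.50, 0.40
    print(partial_r(r_test_ma, r_test_ca, r_ma_ca))  # sub-test vs. MA, CA constant
    print(partial_r(r_test_ca, r_test_ma, r_ma_ca))  # sub-test vs. CA, MA constant
    print(coefficient_of_alienation(r_test_ma))
```

A sub-test that is a good measure of intelligence in the sense described above would show a high partial correlation with mental age and a partial correlation with chronological age near zero.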

It is demonstrated that a sub-test may be a more adequate measure 
of intelligence within certain limits on a distribution than within other 
limits. It is shown that the older child of any mental age group is 
favored more than the younger one by such sub-tests as are partly 
dependent upon chronological age. Sub-tests which ranked low were 
influenced by environmental factors as well as CA. 

Assumptions made are clearly stated and statistical procedures 
defended. The necessity of the application of the Blakeman formula 
and the correction to allow for the fact that each sub-test contributed 
toward the mental age may be questioned on practical grounds but this 
study is more important as a contribution in method than for the 
specific ranks determined for sub-tests or the exact amounts which 
chronological age and environmental factors influence a mental age. 
It well deserves the careful study of experts interested in improving 
measures of mental ability. 

Orleans demonstrates one method of investigating difficulty by 
making an analysis of the nature of situations in relation to their 
difficulty. Such an analysis is justified by the author as follows: 
“Such knowledge allows for an analysis of learning and how it func- 
tions. It is desirable to know what subject-matter is suitable for 
different mental levels and different individuals. More desirable 
is knowledge of why such subject-matter is suitable for such levels and 
such individuals.” 

The situations used in this study were 476 elements of five stand- 
ardized tests. The difficulty value of each element was represented 
by the per cent of correct response. This per cent represents the 
“total score obtained by each age or grade group divided by the total 
obtainable score for that group.” 
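
The definition quoted above can be sketched in a few lines. The function name, the scoring of one point per correct response, and the sample scores are assumptions made for illustration; none of them comes from Orleans's monograph.

```python
def percent_correct(scores_obtained, max_score_per_pupil=1):
    """Difficulty value of one test element for one age or grade group:
    total score obtained divided by total obtainable score, as a per cent."""
    total_obtained = sum(scores_obtained)
    total_obtainable = max_score_per_pupil * len(scores_obtained)
    return 100.0 * total_obtained / total_obtainable


if __name__ == "__main__":
    # Hypothetical group of eight pupils; 1 = correct, 0 = incorrect.
    group_scores = [1, 0, 1, 1, 0, 1, 1, 1]
    print(percent_correct(group_scores))  # six of eight correct
```

On this definition a high per cent marks an easy element, so a trait that makes elements harder will correlate negatively with the per cent unless the signs are reversed.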

Seven factors or traits which were thought to be related to difficulty 
were selected and each element was rated for each trait by five expert 
judges. Correlations between each trait and difficulty for various 
grade, chronological age, and mental age groups were determined to 
demonstrate the nature of difficulty. The seven traits were complexity, 
elaborateness, abstractness, symbolicness, obviousness, familiarity 
and intrinsic interest. 

The validity of the judgments of traits is demonstrated but their 
reliability is not shown beyond the statement that all judges were 
experts. The reliability of the difficulty measures is indicated by 
the probable errors of the per cents. A table of these probable errors 
for populations ranging from 50 to 500 is given and this table should 
prove useful to others dealing with per cent scores. 

The principal facts demonstrated are that complexity, obviousness 
and familiarity “are, in general the most highly correlated with difficulty, 
the correlations being about .60;” the other traits are only slightly 
or not at all related to difficulty; “for subnormal children, complexity 
is the outstanding factor in causing difficulty.” 

Several rather serious errors seem to have been made in the state- 
ments at the bottom of page 23 and the top of page 24 concerning the 
relationships between difficulty and certain traits. These statements 
are opposite to the results shown in Tables VII and X. These errors 
may have been due to the fact that for practical reasons the signs of 
the tables were reversed. Also the reference to Table VIII at the 
bottom of page 23 probably means Table VII. 

The investigation shows careful work and indicates another 
method of making measures of intelligence more discriminating and 
valuable. C. O. MATHEWS. 
Institute of Educational Research, Lincoln School of Teachers College. 





EVALUATING THE LOGICIANS 


The Psychology of Reasoning by Miriam Frances Dunn, Volume I, 
Number 1 of “Studies in Psychology and Psychiatry from the Catholic 
University of America,” edited by Edward A. Pace. Baltimore: 
Williams and Wilkins, June, 1926. Pp. 140. 


Miss Dunn’s study of reasoning inaugurates a new series of publi- 
cations of the Catholic University in the closely allied fields of Psy- 
chology and psychiatry. Future numbers, to be published at irregular 
intervals, will contain monographs and results of individual research. 

This latest “Psychology of Reasoning” attacks the logicians on 
their own battlefield, introspection. After a careful and somewhat 
extended review of the philosophical, empirical and logical theories of 
reasoning, the monograph proceeds to an account of 20 experimental 
cases of reasoning, reported by different subjects. The problems used 
were legal cases and theorems of geometry. In addition to the introspective 
records made spontaneously by the subjects, a long series of 
questions was asked designed to clarify the chief points of accord and 
of disagreement among the logicians. The experimenter and all the 
subjects assumed the syllogistic form of reasoning, and the responses 
and conclusions are in the language of formal logic. The results show 
diversity rather than order, and lead to the conclusion that “there 
seems to be no account, universally holding in all its details, of such a 
complicated process as that of reasoning.” Reasoning occurs in 
varied forms, a number of which are enumerated. The outstanding 
trend, however, seems to have a hint of agreement with the theories of 
John Dewey. “The order in which the premises arise in the subject’s 
mind is immaterial, but the premises must always be taken in conjunction, 
inasmuch as a relationship, necessary for the progress of the 
argumentation, always exists between them.” And again, “In the 
majority of cases the most important process was the perception of 
the relationship existing between the two premises, being in many 
of the [legal] cases that of the interaction of cases and law . . . In 
the formulation of the premises analysis and synthesis functioned in 
about half of the cases.” 

One of the most valuable sections of the monograph is devoted to 
the answers that the logical writers have given to the questions used in 
the experiment, comparing these with the results obtained. One who 
has struggled through Mill, Bradley or Binet will appreciate this 
orderly comparison of their theories of reasoning. Even though the 
practical application of the introspective method to Educational 
Psychology is very limited, the points of agreement of these thinkers 
undoubtedly furnish a basis of some value for further experimental 
attack on reasoning, and hence on the involved questions of problem 
solving and informational learning. LAURANCE F. SHAFFER. 
Institute of Educational Research, Lincoln School of Teachers College. 





OTHER PUBLICATIONS RECEIVED 


A. PUBLICATIONS IN EDUCATIONAL PSYCHOLOGY 


CHILD STUDY ASSOCIATION OF AMERICA, GRUENBERG, BENJAMIN 
C., Editor: Guidance of Childhood and Youth, Readings in Child Study. 
New York: The Macmillan Company, 1926, pp. 324. 

COX, CATHARINE M.: Early Mental Traits of Three Hundred Geniuses. 
Genetic Studies of Genius, Volume II. Stanford University 
Press, 1926, pp. 842. 

WEIDMANN, CHARLES C.: How to Construct the True-false Examination. 
New York: Teachers College, Columbia University, 1926, pp. 
118. 


B. PUBLICATIONS IN PSYCHOLOGY 


RAUTH, JOHN W.: Diastatic Activity of the Blood Serum in Mental 
Disorders, Studies in Psychology and Psychiatry, Volume I, Number 
2, June, 1926. Baltimore: The Williams and Wilkins Company, 
pp. 32. 

TAYLOR, W. S.: Readings in Abnormal Psychology and Mental 
Hygiene. New York: D. Appleton and Company, 1926, pp. 789. 

HOLLINGWORTH, H. L.: The Psychology of Thought. New York: 
D. Appleton and Company, 1926, pp. 329. 


C. PUBLICATIONS IN THE GENERAL EDUCATIONAL FIELD 


BLAKE, MABELLE B.: Guidance for College Women. New York: 
D. Appleton and Company, 1926, pp. 285. 

BLANKENSHIP, ALBERT S.: The Accessibility of Rural Schoolhouses 
in Texas. New York City: Teachers College, Columbia University, 
1926, pp. 62. 

DAY, MARY S.: Scheubel as an Algebraist. New York City: 
Teachers College, Columbia University, 1926, pp. 168. 

JOB, LEONARD B.: Business Management of Institutional Homes for 
Children. New York City: Teachers College, Columbia University, 
1926, pp. 205. 

LERRIGO, MARION O.: Health Problem Sources. New York City: 
Teachers College, Columbia University, 1926, pp. 151. 

McHALE, KATHRYN: Comparative Psychology and Hygiene of the 
Over-weight Child. New York City: Teachers College, Columbia 
University, 1926, pp. 123. 

MYERS, ALONZO F. and BEECHEL, EDITH E.: Manual of Observation 
and Participation. New York: American Book Company, 1926, 
pp. 263. 

SAXMAN, ETHEL J.: Students’ Use in Leisure Time of Activities 
Learned in Physical Education in State Teachers College. New York 
City: Teachers College, Columbia University, 1926, pp. 90. 

SCHMALHAUSEN, SAMUEL D.: Humanizing Education. New York: 
The New Education Publishing Company, 1926, pp. 343. 

TAYLOR, ROBERT B.: Principles of School Supply Management. 
New York City: Teachers College, Columbia University, 1926, pp. 
145. 










TOTAH, KHALIL A.: Contribution of the Arabs to Education. New 
York City: Teachers College, Columbia University, 1926, pp. 105. 

WILLING, MATTHEW H.: Valid Diagnosis in High School Composition. 
New York City: Teachers College, Columbia University, 1926, 
pp. 64. 

WILSON, GUY M.: What Arithmetic Shall We Teach? Boston: 
Houghton Mifflin Company, 1926, pp. 149. 


D. NEW SCHOOL TEXTBOOKS 


CHAMBERLAIN, JAMES F., and CHAMBERLAIN, ARTHUR H.: South 
America, A Supplementary Geography. New York: The Macmillan 
Company, 1926, pp. 203. 

DICKSON, MARGUERITE S.: American History for Grammar Schools. 
Revised Edition. New York: The Macmillan Company, 1926, pp. 
655. 


E. OTHER PUBLICATIONS 


RILEY, WOODBRIDGE: From Myth to Reason. New York: D. Appleton 
and Company, 1926, pp. 327. 

SCHAUFFLER, HENRY P.: Adventures in Habit-craft, Character in 
the Making. New York: The Macmillan Company, 1926, pp. 164. 


F. NEW STANDARDIZED TESTS 


H. R. STEEVES, ALLAN ABBOTT, and BEN D. WOOD, Authors: 
English Test, Columbia Research Bureau. Yonkers: World Book 
Company, 1926. 

A. A. MERAS, SUZANNE ROTH, and BEN D. WOOD, Authors: 
French Test, Columbia Research Bureau. Yonkers: World Book 
Company, 1926. 

C. M. PURIN and BEN D. WOOD, Authors: German Test, Columbia 
Research Bureau. Yonkers: World Book Company, 1926. 

HERMAN W. FARWELL and BEN D. WOOD, Authors: Physics Test, 
Columbia Research Bureau. Yonkers: World Book Company, 1926. 

HERBERT E. HAWKES and BEN D. WOOD, Authors: Plane Geometry, 
Columbia Research Bureau. Yonkers: World Book Company, 1926. 

FRANK CALLCOTT and BEN D. WOOD, Authors: Spanish Test, 
Columbia Research Bureau. Yonkers: World Book Company, 1926. 

KING, FLORENCE B., and CLARK, HAROLD F.: Foods Test. Yonkers: 
World Book Company, 1926. 

ORLEANS, JACOB S., and SOLOMON, MICHAEL: Latin Prognosis Test. 
Yonkers: World Book Company, 1926. 